Monorepos vs Megarepos

I’ve been blogging about Google’s Monorepo for a number of years. Indeed, I was doing so for some years before the practice of doing Trunk-Based Development in a single trunk/master branch for the whole dev org, regardless of dev team, became known as “Monorepos”.

What’s a Megarepo then? Well, it is the same single-repository idea, but without the whole dev org working in a single from-root trunk and all integrating there.

Consider a source tree:

app1/
  src/
    prod/
    test/
  BUILDFILE
app2/
  src/
    prod/
    test/
  BUILDFILE
svc3/
  src/
    prod/
    test/
  BUILDFILE
svc4/
  src/
    prod/
    test/
  BUILDFILE

In Git and Mercurial you can only branch from root. The branch you’re working on isn’t represented in the directory structure of the working copy, so you have to run a command such as “git status” to see it. If you switch branches, the directory structure still doesn’t show it, and you’d have to run “git status” again to confirm. Maybe you’ve configured your shell/terminal to keep reminding you which branch you’re on. Of course, Trunk-Based Development teams, if they branch at all, are only making short-lived feature branches (PR branches), and they delete them quickly once they’ve been merged/integrated into trunk/master.
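To make that concrete: Git records the checked-out branch in the `.git/HEAD` file, not in any path of the working copy. Here is a minimal Python sketch that parses the contents of that file. The function name `branch_from_head` is made up for illustration, and it only handles the two common forms: a symbolic ref to a branch, or a bare commit hash when HEAD is detached.

```python
from typing import Optional


def branch_from_head(head_text: str) -> Optional[str]:
    """Return the branch name recorded in the contents of .git/HEAD.

    Returns None when HEAD is detached, i.e. the file holds a commit
    hash rather than a symbolic ref to a branch.
    """
    line = head_text.strip()
    prefix = "ref: refs/heads/"
    if line.startswith(prefix):
        # Everything after the prefix is the branch name; it may itself
        # contain slashes (for example a PR branch named feature/x).
        return line[len(prefix):]
    return None
```

Given the text `ref: refs/heads/master`, it returns `master`. Nothing in the names of the checked-out directories changes when that file does, which is why you need a command (or a shell prompt helper) to tell you where you are.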

In a true Trunk-Based Development + Monorepo model, Subversion, Perforce or ClearCase teams may work this way (not quite so expanded):

branches/
  2019_04_Release/
    app1/
    app2/
    svc3/
    svc4/
  2019_05_Release/
    app1/
    app2/
    svc3/
    svc4/
trunk/
  app1/
  app2/
  svc3/
  svc4/

With Subversion, Perforce, ClearCase and others, though, you can branch anywhere. Yup, arbitrary branching is supported. Let’s look at the same source tree, but expanded differently:

app1/
  trunk/
    src/
      prod/
      test/
    BUILDFILE
  branches/
    April_2019/
    May2019/
      src/
        prod/
        test/
      BUILDFILE
app2/
svc3/
svc4/
  branches/
    integ/
    orm_makeover/
    team4/
      src/
        prod/
        test/
      BUILDFILE

This dev organization isn’t doing Monorepo. They have a single repo with all the buildable and shippable/deployable pieces in it (we presume), but each team chooses where and how it does its branching. Many enterprises are like this, and know they probably should get to Trunk-Based Development (TBD) but don’t know how. Let’s call that legacy corporate practice “megarepos” to differentiate it.
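The difference between the two layouts can be checked mechanically. What follows is a rough sketch in Python, not an established tool: it looks at where the first `trunk/` or `branches/` directory appears in each path. The names `branch_root_depths` and `classify` are hypothetical, and it assumes no project is itself called `trunk` or `branches`.

```python
from typing import Iterable, Set

BRANCH_DIRS = {"trunk", "branches"}


def branch_root_depths(paths: Iterable[str]) -> Set[int]:
    """Depths (0 = repo root) at which a trunk/ or branches/ dir first appears."""
    depths = set()
    for path in paths:
        parts = [p for p in path.split("/") if p]
        for depth, name in enumerate(parts):
            if name in BRANCH_DIRS:
                depths.add(depth)
                break  # only the outermost branching point matters
    return depths


def classify(paths: Iterable[str]) -> str:
    """Label a repo layout by where its branching points sit."""
    depths = branch_root_depths(paths)
    if not depths:
        return "unknown"  # no trunk/ or branches/ directories at all
    if depths == {0}:
        return "monorepo"  # branching happens only from the root
    return "megarepo"  # at least one project branches on its own
```

Fed paths such as `trunk/app1/src/prod` and `branches/2019_04_Release/app1`, `classify` answers `monorepo`. Fed `app1/trunk/src/prod` and `svc4/branches/team4/BUILDFILE`, it answers `megarepo`.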

Companies leaping into being today are either doing multi-repos (popular with microservice teams) with one of a number of unapproved branching models*, or Monorepo with Trunk-Based Development. If they’re doing the latter, they’re using either a recursive build style of modularity (Maven, and the like for non-Java platforms) or a directed graph build system like Google’s Bazel or Facebook’s Buck. There are pros and cons to both. Anyway, these new startups going monorepo have a natural advantage over legacy enterprises stuck in their megarepos and the associated practices.
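For a feel of what “directed graph build” means, here is a small Python sketch. It is not how Bazel or Buck are implemented. It only illustrates the idea of declaring dependencies as a graph, then deriving a build order and the set of modules affected by a change. The dependencies between app1, app2, svc3 and svc4 are invented for the example, as are the function names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependencies: each key depends on the modules in its set.
DEPS = {
    "app1": {"svc3"},
    "app2": {"svc3", "svc4"},
    "svc3": set(),
    "svc4": set(),
}


def build_order(deps):
    """One valid order in which every module is built after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def affected_by(changed, deps):
    """Modules that need rebuilding when `changed` changes (itself included)."""
    affected = {changed}
    grew = True
    while grew:  # keep sweeping until no new dependents turn up
        grew = False
        for module, needs in deps.items():
            if module not in affected and needs & affected:
                affected.add(module)
                grew = True
    return affected
```

With these made-up dependencies, a change to svc4 affects only svc4 and app2, so app1 and svc3 need not be rebuilt or retested. A recursive build gets its order from the nesting of modules. A graph build gets it from the declared edges.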

Note: Big historical open source portals like the Apache Software Foundation (ASF) kept all their separate projects in a single megarepo too (Subversion), at least historically. That was until Git’s popularity meant that the ASF deployed Git infrastructure too, for a multi-repo setup, and allowed projects to choose where they wanted to be and when to migrate. Each project that ended up in Git was mirrored to GitHub, as that is popular too. Still multi-repos, though.


Published

June 11th, 2019


Paul Hammant's Blog: Monorepos vs Megarepos
