Monorepos with recursive or directed-graph build technologies?

Sam’s Twitter conversation with me and Simon Stewart earlier this week is still generating discussion. I thought I’d drill into the build-technology choices you have for monorepos.

	Directed Graph	Recursive
Build from which directory?	Always the base dir of checkout	Base or module directory
How to build one module only?	Specify it from base dir	Be in the directory in question
Third party dependencies?	Declared once for all	Declared per module
Third party dep upgrade idiom?	Lockstep upgrades	Piecemeal upgrades
Module inter-dependencies?	Full path from base declared inline	Declared by name in dependent module
Build tech can intelligently skip modules?	By design, yes	Yes, with extra build flags
References to parent modules are?	Implicit ‘..’ directory refs within source tree	Acquired from artifact cache/repo by group/name/version
If you checked out a sub-module only?	You could not build it	You could build it
Required repo organization?	Very organized, very consistent	Modestly organized; inconsistency is possible, at each module team’s choice
RAM-style build cache?	Fine-grained would be possible but hard	Fine-grained possible but very hard
Monorepo scales up to?	Tens of thousands of committers depending on VCS choice	Hundreds of committers
Version number of built products?	Lockstep versioning, or unversioned	Lockstep versioning or per-module versioning
Can be used with large team sizes?	Yes	Yes
Circular modular dependencies allowed?	No	No
Circular dependencies hidden via third-party deps?	Nearly impossible	Possible
Requires you to do lockstep deployments?	No	No
Cross module atomic commits possible?	Yes	Yes

Buck, Bazel (née Blaze), Pants, and Selenium’s CrazyFun are all directed-graph build technologies.
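To make a few rows of the table concrete (building one module from the base dir, skipping modules by design, no circular dependencies), here is a minimal sketch of what a directed-graph build does with its declarations. It is not how Buck, Bazel, or Pants are actually implemented; the `GRAPH` contents, the `//path` target names, and the `build_order` function are all hypothetical and exist only for illustration.

```python
# Hypothetical module graph: each target lists, by full path from the base
# of the checkout, the targets it depends on. All names are made up.
GRAPH = {
    "//apps/web": ["//libs/auth", "//libs/logging"],
    "//apps/batch": ["//libs/logging"],
    "//libs/auth": ["//libs/logging"],
    "//libs/logging": [],
}


def build_order(graph, target):
    """Return the targets needed to build `target`, dependencies first.

    Raises ValueError if a circular dependency is reachable from `target`.
    """
    order = []        # finished targets, dependencies before dependents
    done = set()      # targets already placed in `order`
    in_progress = []  # current depth-first path, used to report cycles

    def visit(node):
        if node in done:
            return
        if node in in_progress:
            cycle = in_progress[in_progress.index(node):] + [node]
            raise ValueError("circular dependency: " + " -> ".join(cycle))
        in_progress.append(node)
        for dep in graph[node]:
            visit(dep)
        in_progress.pop()
        done.add(node)
        order.append(node)

    visit(target)
    return order


if __name__ == "__main__":
    # Only what "//apps/web" needs is listed; "//apps/batch" is never visited.
    print(build_order(GRAPH, "//apps/web"))
```

Because the whole graph is known up front, asking for one target from the base dir is enough to work out exactly which modules are needed, and everything else is skipped without extra flags.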

Maven and Gradle (for Java at least) are the two main recursive build technologies.

Lockstep upgrades

The dividing feature that will affect you most is lockstep upgrades. The directed-graph build systems are going to make you upgrade a dep for all modules, all at the same time, in one go. That seems frightening, and it is an upfront cost, but it is totally worth it. One pair of committers is going to upgrade “log4j” for everyone in one commit, and run a very big pre-commit build to ensure that nobody is going to complain.

With the recursive build systems, things like “log4j” could be upgraded in different modules at different times. Doing an upgrade on a module-by-module basis is going to be easier to complete (Google once struggled to upgrade JUnit from 3.8 to 4.x in a single commit). Easier to complete, great, but what if you didn’t complete? What if you observe that there’s a spread of declared versions of a dep? Because “the business” doesn’t know how to price up “technical ~~debt~~ damage” in a codebase, that spread might never get addressed. Even if you attempt to centralize the declarations of dependencies in recursive build tools, you can still get the spread, or at least small complexities in use.

The problem is that “COULD” is “HAS ALREADY” for 99.5% of enterprises, and they almost never pull back from that situation. Lockstep upgrades are totally worth it. Maven, Gradle and the like should gain the equivalent of a “--no-dependency-variance” build flag, to allow builds to fail fast in a spread situation.
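The fail-fast check wished for above could look roughly like the sketch below. To be clear, this is not a Maven or Gradle feature and it does not read real build files: the `DECLARED` data, the module names, the version numbers, and the `find_variance` function are assumptions made purely for illustration.

```python
import sys
from collections import defaultdict

# Hypothetical per-module dependency declarations, as a recursive build
# would have them. Module names and versions are invented.
DECLARED = {
    "billing": {"log4j": "1.2.17", "junit": "4.12"},
    "orders": {"log4j": "1.2.17", "junit": "4.12"},
    "reports": {"log4j": "1.2.14", "junit": "4.12"},
}


def find_variance(declared):
    """Map each dep declared at more than one version to {version: [modules]}."""
    by_dep = defaultdict(lambda: defaultdict(list))
    for module, deps in declared.items():
        for dep, version in deps.items():
            by_dep[dep][version].append(module)
    return {
        dep: {version: sorted(modules) for version, modules in versions.items()}
        for dep, versions in by_dep.items()
        if len(versions) > 1  # keep only deps with a spread of versions
    }


if __name__ == "__main__":
    spread = find_variance(DECLARED)
    for dep, versions in sorted(spread.items()):
        for version, modules in sorted(versions.items()):
            print(f"{dep} {version}: {', '.join(modules)}")
    # A non-zero exit code is what would make the build fail fast.
    sys.exit(1 if spread else 0)
```

Run as part of a pre-commit or CI build, a check like this would turn the “reports” module's lagging “log4j” into a failure on day one rather than something discovered years later.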

Published

March 28th, 2017

Paul Hammant's Blog: Monorepos with recursive or directed-graph build technologies?