For years I’ve contrasted depth-first recursive build technologies (Maven, Gradle, MSBuild) with directed acyclic graph (DAG) ones (Blaze/Bazel, Buck, Nx). I keep a small simulation repo, google-monorepo-sim, to make the difference concrete rather than hand-wavy. This entry walks through it.

The sim used to be a pile of .compile.sh / .tests.sh / .dist.sh bash scripts with a shared-build-scripts/ folder of common logic and a .buildStepsDoneLastExecution file for tracking what had already run. That’s all gone now. The DAG side is built with Aether Build (aeb), a small polyglot build system written in Aether. It’s still crude and teachable in a way that compares to Bazel/Buck/Nx.

Multi-module build systems per language

| Language | Depth-First Recursive | Directed Acyclic Graphs (leaf-first) | Other / Community Tools |
| --- | --- | --- | --- |
| Python | setuptools, Poetry, SCons | Pants | Hatch, PDM |
| C++ | | CMake, Ninja, Bazel, Buck | Meson, Tup |
| Java | Maven, Gradle, Ant | Bazel, Buck | |
| C | | Make, CMake, Bazel | Meson, Ninja, Autotools |
| C# | MSBuild, Cake, NAnt | | Fake (F#), dotnet CLI |
| JavaScript | npm/yarn | Webpack, Rollup, Nx | Vite, Gulp, Grunt, esbuild |
| Go | | go build, Make, Task, Bazel | Mage |
| PHP | Composer, Phing | | Robo, Deployer |
| Ruby | Rake, Bundler | | Thor, Hoe |
| Swift | SwiftPM, Xcode | Bazel | Tuist |
| Kotlin | Gradle, Maven | Bazel | |
| TypeScript | | tsc, Webpack, Rollup | esbuild, Vite |
| R | R CMD build, devtools | Make | |
| Rust | Cargo | Buck, Bazel, Make | |
| Scala | sbt, Gradle, Maven | Pants, Bazel | Mill |
| Elixir | Mix | | Rebar (from Erlang), Bake |
| Haskell | Stack, Cabal | Bazel | Shake |
| Dart | pub | Bazel (via rules_dart) | build_runner |
| Erlang | Rebar3 | | erlang.mk |
| Zig | zig build system | | CMake (rare, via wrappers) |

Two anemic applications

We’re going to build two contrived applications: DirectedGraphBuildSystemsAreCool and MonoreposRule. All they do is print to STDOUT then exit. Each only needs the letters in its own class name, which is what gives us a nice non-trivial dependency graph without any actual business logic to get in the way.

Components that the applications use

The components are categorized into modules, each representing a type of phonetic sound:

  • Vowels: A, E, I, O, U.
  • Nasal: M and N.
  • Voiceless: P and T.
  • Sonorants: L and R.
  • Fricatives: S.
  • Labiodental: V and W.
  • Glides: H, J, and Y.
  • Sibilants: Q, X, and Z.
  • Velar: G and K.
  • Voiced: B and D.

Since the last time I wrote this up, the sim went polyglot. It’s now five languages, and some of the components crossed a language boundary:

  • vowelbase bridges Java and Rust via JNI. The VowelBase Java class loads a native library libvowelbase.so, implemented in Rust, whose printString wraps a string in parentheses. A gratuitous use of Java-invoking-Rust, but it demonstrates a cross-language edge in the graph.
  • nasal now bridges Java and Go. The Go module compiles in c-shared mode to libgonasal.so, and the Java nasal component loads it via JNI.
  • sonorants (L, R) is now a Kotlin module rather than Java.
  • There are also TypeScript, C#, Python and even an Aether-language component in the tree for other demos, but the two apps above don’t depend on them.

The dependency graph

Here’s the class-dependency graph for the two apps. Each app (top) depends only on the letter classes in its own name; the letter classes live in phonetic-category component modules (middle); and two of those modules cross a language boundary via JNI - vowels through the Rust libvowelbase.so, nasal through the Go libgonasal.so - while sonorants is Kotlin. sibilants and labiodental are present in the tree but neither app needs them, so a DAG build never touches them.

Class dependency graph for the monorepo-sim apps

The DAG side: Aether Build

I’m not going to use Bazel (bazel.build, Google’s open-sourcing of Blaze) or Nx (nx.dev). I’m using aeb, which illustrates the two aspects of Blaze I most remember from my nearly two years in Google’s Test Mercenaries team:

  1. a leaf-first directed (acyclic) graph approach to building in a monorepo, and
  2. a changed-or-not check of inputs to skip individual steps.

Build files

Each module directory carries up to three declarative build files:

| File | Purpose |
| --- | --- |
| .build.ae | Compile - declares deps, invokes the language compiler |
| .tests.ae | Test - declares deps + test libs, compiles and runs tests |
| .dist.ae | Package - builds a fat jar or other distributable |

Here’s java/applications/monorepos_rule/.build.ae:

import build
import build (prereq)
import java

aeb(cap) {
    b = build.start()
    prereq(b, "jdk:21")
    build.dep(b, "java/components/fricatives/.build.ae")
    build.dep(b, "java/components/nasal/.build.ae")
    build.dep(b, "kotlin/components/sonorants/.build.ae")
    build.dep(b, "java/components/voiceless/.build.ae")
    build.dep(b, "java/components/vowels/.build.ae")
    java.javac(b)
}

Three things worth pointing at:

  • aeb(cap) is the entrypoint - not main(). The build receives a capability handle cap from the trusted aeb host (the same handle that backs runtime sandbox containment). A build file never constructs its own authority; it only receives it. (The legacy main() spelling still lowers to the same context-receiving entrypoint, but aeb(cap) is the convention the repo demonstrates.)
  • prereq(b, "jdk:21") declares the toolchain this node needs, OS-agnostically. A remote agent’s requester can read aeb --prereqs and pick, say, an aeb-tc:rust or aeb-tc:go image per dispatch - the bare host need not have every toolchain installed.
  • Dependencies are one build.dep(b, "path") per line - greppable, so the DAG can be extracted without compiling anything.
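To make the "greppable" point concrete, here is a minimal Python sketch that recovers a node's edges with one regular expression. It assumes deps are always written in the exact single-line build.dep(b, "path") shape shown above; the helper name extract_deps is mine and not part of aeb.

```python
import re

# Matches lines of the form: build.dep(b, "some/path/.build.ae")
# Assumption: each dep is a single string literal on one line.
DEP_RE = re.compile(r'build\.dep\(\s*\w+\s*,\s*"([^"]+)"\s*\)')


def extract_deps(build_file_text: str) -> list[str]:
    """Return the dependency paths declared in one build file, in file order."""
    return DEP_RE.findall(build_file_text)


# The build file quoted above, held as a string so nothing is compiled or run.
MONOREPOS_RULE = '''
aeb(cap) {
    b = build.start()
    prereq(b, "jdk:21")
    build.dep(b, "java/components/fricatives/.build.ae")
    build.dep(b, "java/components/nasal/.build.ae")
    build.dep(b, "kotlin/components/sonorants/.build.ae")
    build.dep(b, "java/components/voiceless/.build.ae")
    build.dep(b, "java/components/vowels/.build.ae")
    java.javac(b)
}
'''

deps = extract_deps(MONOREPOS_RULE)
for dep in deps:
    print(dep)
```

The prereq line is deliberately not matched: it names a toolchain, not an edge in the graph.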

The cross-language edges are just more build.dep lines. java/components/vowelbase/.build.ae depends on rust/components/vowelbase/.build.ae; java/components/nasal/.build.ae depends on go/components/nasal/.build.ae. The Rust one is a cargo_project producing a cdylib; the Go one is a go_build in c-shared mode.

A single aeb invocation scans the relevant .build.ae / .tests.ae / .dist.ae files, topologically sorts the dependency graph, and runs everything in one process with an in-memory visited-module map. No per-step shell fork storm, no .buildStepsDoneLastExecution file.
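The scan-then-sort step can be sketched in a few lines of Python. This is an illustration of the technique, not aeb's implementation: the app's five deps and the two cross-language edges come from the build files described above, while the vowels -> vowelbase edge and the short module keys are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Module -> modules it depends on.
DEPS = {
    "java/applications/monorepos_rule": [
        "java/components/fricatives",
        "java/components/nasal",
        "kotlin/components/sonorants",
        "java/components/voiceless",
        "java/components/vowels",
    ],
    "java/components/nasal": ["go/components/nasal"],
    "java/components/vowels": ["java/components/vowelbase"],  # assumed edge
    "java/components/vowelbase": ["rust/components/vowelbase"],
    # Present in the tree, but unreachable from the requested target.
    "java/components/sibilants": [],
    "java/components/labiodental": [],
}


def reachable(target, deps):
    """Collect the target and everything it transitively depends on."""
    visited = set()  # the in-memory visited-module map
    stack = [target]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        stack.extend(deps.get(node, []))
    return visited


def build_order(target, deps):
    """Leaf-first order covering only the target's own subgraph."""
    needed = reachable(target, deps)
    sorter = TopologicalSorter({n: deps.get(n, []) for n in needed})
    return list(sorter.static_order())


order = build_order("java/applications/monorepos_rule", DEPS)
for module in order:
    print("compile:", module)
```

Only modules reachable from the requested target are ever looked at, so sibilants and labiodental never enter the sort.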

First build

Let’s pick the MonoreposRule app and ask for its distribution jar. First, a genuinely cold build: aeb keeps a content-addressed cache in ~/.aeb/cache, so to see everything actually compile I clear both that and target/:

rm -rf target ~/.aeb/cache
aeb java/applications/monorepos_rule/.dist.ae

The telemetry summary:

[telemetry]
  dist:    java/applications/monorepos_rule 0.00s [n/a]
  build:   java/applications/monorepos_rule 0.00s [miss]
  build:   java/components/fricatives       0.00s [miss]
  build:   java/components/nasal            0.00s [miss]
  build:   kotlin/components/sonorants      0.00s [miss]
  build:   java/components/voiceless        0.00s [miss]
  build:   java/components/vowels           0.00s [miss]
  build:   go/components/nasal              0.00s [miss]
  build:   java/components/vowelbase        0.00s [miss]
  build:   rust/components/vowelbase        0.00s [miss]
  jni.crate: libs/rust/registry/vendor/jni    0.00s [n/a]
total: 15.80s wall
aeb: 9 compile + 1 dist + 0 test
  compile: java/components/fricatives
  compile: go/components/nasal
  compile: java/components/nasal
  compile: kotlin/components/sonorants
  compile: java/components/voiceless
  compile: rust/components/vowelbase
  compile: java/components/vowelbase
  compile: java/components/vowels
  compile: java/applications/monorepos_rule
  dist:    java/applications/monorepos_rule

Note what is not there. The MonoreposRule app only needs the letters M O N O R E P O S R U L E, so aeb compiled exactly nine modules: the eight feeding those letters (Java fricatives/nasal/voiceless/vowels/vowelbase, Kotlin sonorants, Go nasal and Rust vowelbase, plus the vendored jni crate) and the app itself. sibilants, labiodental, consonants, glides, velar, voiced are all present in the filesystem but irrelevant to this target, so they never compiled. That skipping of present-but-not-pertinent source is the defining trait of a DAG build system. Five languages, one command, one linked binary orchestrating the lot.

Running it again

Run the same command a second time and every node is a cache hit:

[telemetry]
  dist:    java/applications/monorepos_rule 0.00s [n/a]
  build:   java/applications/monorepos_rule 0.00s [hit]
  build:   java/components/fricatives       0.00s [hit]
  build:   java/components/nasal            0.00s [hit]
  build:   kotlin/components/sonorants      0.00s [hit]
  build:   java/components/voiceless        0.00s [hit]
  build:   java/components/vowels           0.00s [hit]
  build:   go/components/nasal              0.00s [hit]
  build:   java/components/vowelbase        0.00s [hit]
  build:   rust/components/vowelbase        0.00s [hit]
  jni.crate: libs/rust/registry/vendor/jni    0.00s [n/a]
total: 3.20s wall

15.8s cold, ~3.2s warm. The cache is keyed on the digest of the inputs, not file timestamps - a meaningful upgrade over the old bash sim, which used modification times. Because the cache lives in ~/.aeb/cache and not under target/, blowing away target/ alone still yields all-hits; you have to clear the cache to force real recompilation. [hit] / [miss] / [n/a] in the telemetry is the whole incrementality story, right there.
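A toy Python sketch of why a digest key behaves differently from a timestamp check. It is not aeb's cache: the SHA-256 choice, the key layout and the names input_digest, ContentCache and fake_compile are all assumptions for illustration.

```python
import hashlib


def input_digest(inputs: dict[str, bytes]) -> str:
    """Digest over file names and contents; timestamps play no part."""
    h = hashlib.sha256()
    for name in sorted(inputs):  # stable order gives a stable key
        data = inputs[name]
        encoded = name.encode("utf-8")
        # Length prefixes keep name/content boundaries unambiguous.
        h.update(len(encoded).to_bytes(8, "big"))
        h.update(encoded)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


class ContentCache:
    """Toy content-addressed cache: key = digest of the inputs."""

    def __init__(self):
        self._store = {}

    def build(self, inputs, compile_fn):
        key = input_digest(inputs)
        if key in self._store:
            return "hit", self._store[key]
        artifact = compile_fn(inputs)
        self._store[key] = artifact
        return "miss", artifact


def fake_compile(inputs):
    # Stand-in for a real compiler invocation.
    return b"".join(inputs[n] for n in sorted(inputs)).upper()


cache = ContentCache()
src = {"S.java": b"class S {}"}
first, _ = cache.build(src, fake_compile)
second, _ = cache.build(dict(src), fake_compile)  # same bytes, fresh copy
third, _ = cache.build({"S.java": b"class S { int x; }"}, fake_compile)
print(first, second, third)  # miss hit miss
```

Identical bytes hit even when the files have been rewritten or re-checked-out, which is exactly the case a modification-time comparison gets wrong.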

Running the tests

aeb javatests/applications/monorepos_rule/.tests.ae
[telemetry]
  tests:   javatests/applications/monorepos_rule 0.00s [miss] 1/1 PASS
  build:   java/applications/monorepos_rule 0.00s [hit]
  junit.jar: libs/java/junit                  0.00s [n/a]
  hamcrest.jar: libs/java/hamcrest               0.00s [n/a]
  ...
total: 4.16s wall

The .tests.ae file depends on the app’s .build.ae plus the vendored JUnit and Hamcrest jars, then runs java.javac_test and java.junit. Because the app’s compile is already cached, only the test compile-and-run actually happens.

Running the app

The .dist.ae step told us how to launch it. The fat jar carries all 1000-odd classes plus both native libraries (libvowelbase.so, libgonasal.so) at the jar root, so:

java -Djava.library.path=. -jar monorepos-rule.jar
main() .. MonoreposRule instance created:
<M>(O)<N>(O){R}(E)P(O)S{R}(U){L}(E)
Key: (vowels via Rust), <nasal via Go>, {sonorants via Kotlin}, all others pure Java

(...) are vowels routed through the Rust JNI library, <...> are the nasal letters routed through the Go shared library, {...} are sonorants from the Kotlin module, and the bare letters are plain Java. That single line of stdout is exercising four languages linked into one process. Then it exits. That is all these apps do.

Building the other application

aeb java/applications/directed_graph_build_systems_are_cool/.dist.ae

DirectedGraphBuildSystemsAreCool needs a bigger alphabet (consonants, glides, velar, voiced come into play), so its graph pulls in more modules - 13 compile targets versus 9. But everything it shares with MonoreposRule - fricatives, nasal, sonorants, voiceless, vowels, vowelbase and the Rust and Go native libraries behind them - is already a cache hit from the previous build. Only the genuinely new modules compile. The fat jar it produces contains that app’s deps and none of the MonoreposRule-only ones. It runs the same way:

main() .. DirectedGraphBuildSystemsAreCool instance created:
D(I){R}(E)CT(E)DG{R}(A)PHB(U)(I){L}DSYST(E)<M>S(A){R}(E)C(O)(O){L}
Key: (vowels via Rust), <nasal via Go>, {sonorants via Kotlin}, all others pure Java

The depth-first recursive side: Maven

The sim’s depth-first_recursive_modular_monorepo branch has the same Java and Rust sources, in a classic modular layout, built by Maven.

In a typical Maven build, modules are built from the root, resolving and building dependencies in the right sequence. Maven encodes its build instructions in XML, one pom.xml per module. It traverses each module to its full depth of submodules before moving to the next sibling, in order to understand the entire implicit graph. Intelligence from that graph lets Maven reorder modules so the most-depended-on ones compile and test first. That’s a depth-first visitation.

Note: This branch is significantly out of date now. I had hoped to make it a solid peer of the aeb-using one but that has not happened.

A full clean build

./quieter-mvn clean package

(That shell script is just mvn clean package with fewer lines of output.) Maven visits every module, running clean, compile, test-compile, test and jar for each - dozens of steps - because package from the root touches the whole reactor. On my middle-of-the-pack AMD desktop that was about 11 seconds.

An incremental build (no clean)

./quieter-mvn package

Maven skips recompilation where sources haven’t changed (it compares source timestamps against compiled classes - Rust included). But it still enters the compile and Surefire plugins for every module and still executes tests regardless of whether that module’s sources changed, because a linked dependency may have. The skip of the actual do-compile / run-tests happens inside the plugin, so the modules still get listed. About 5.5 seconds - roughly half the clean run.

Focusing with -pl and -am

./quieter-mvn package -pl applications/monorepos_rule -am

-pl (project list) names the module to build; -am (also-make) ensures its dependencies build too. Maven still visits every module however deep to build understanding, then subsets the reactor to the target and its deps - so sibilants and friends are skipped. This is the closest Maven gets to the DAG tools’ behaviour, and its build time here matches the equivalent focused aeb invocation.

Maven’s Reactor is the part that schedules module builds by dependency order and runs the lifecycle phases - compile, test, package - within each. The key contrast with aeb: Maven’s reactor reads the whole source base up front to plan, whereas the DAG side has no scheduler surveying everything first - it walks out from the requested target through its declared deps.

Comparing features

In a larger company’s monorepo, where the intent is to share code at the source level and minimize reliance on a binary repository like Nexus or Artifactory, you’d reach for a DAG build system - Buck, Bazel, or Nx - for:

  1. Incremental builds - track input changes, rebuild only what’s affected.
  2. Remote caching and execution - share artifacts across machines / teammates.
  3. Hermetic builds - same inputs, same outputs, regardless of environment.
  4. Parallel execution - independent modules across cores.
  5. Fine-grained dependency management - deps declared at file, not just module, granularity.
  6. Cross-language support - one graph spanning many languages.
  7. Reproducibility - the DAG makes builds repeatable.
  8. Scalability - handle huge numbers of modules and deps.
  9. Custom build rules - teams define their own build logic.
  10. CI/CD integration.
  11. Community and ecosystem.

My aeb-based sim, scored 0–5 against those:

  1. 4 - real content-addressed caching now (digest of inputs, not timestamps), though it still doesn’t do fine-grained sub-file dependency analysis.
  2. 1 - a shared local cache exists (~/.aeb/cache); no remote cache/execution.
  3. 2 - prereq() declarations move it toward hermeticity (toolchain selection per node), but it’s not sealed.
  4. 3 - single-process topo-sorted execution; independent nodes can run without ceremony.
  5. 3 - module-level deps; Bazel can depend on individually-listed sources, which I don’t.
  6. 5 - genuinely polyglot now: Java, Kotlin, Go, Rust and TypeScript in one graph, with JNI/FFI edges across language boundaries. This is the big change since the bash version.
  7. 3.
  8. 3 - the .build.ae files are far less repetitive than the old bash scripts thanks to the SDK’s javac() / kotlinc() / go_build() / cargo_build() / tsc() / shade() helpers.
  9. 2 - doable via the SDK; I haven’t shown a bespoke rule.
  10. 4 - nothing about this would trouble CI; the reporting could be prettier.
  11. 1 - it’s a teaching toy, not an ecosystem.

Circular references

Neither class of build system allows circular references at the module level - both fail early on a compile/test/package attempt. Some languages allow circular references between files (Laurel.java and Hardy.java referring to each other), but that lives inside a single compiler invocation (one module), which is a different thing.
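A minimal sketch of that fail-early check at module level, using Python's standard graphlib. The module names are hypothetical, and neither aeb nor Maven is claimed to detect cycles this way.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(deps):
    """Return None if the module graph is a DAG, else one offending cycle."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError as err:
        # The second element of args is the list of nodes forming the cycle;
        # its first and last entries are the same node.
        return err.args[1]
    return None


# Hypothetical module graphs: module -> modules it depends on.
acyclic = {"app": ["vowels"], "vowels": ["vowelbase"], "vowelbase": []}
circular = {"laurel": ["hardy"], "hardy": ["laurel"]}

print(find_cycle(acyclic))
print(find_cycle(circular))
```

The check runs before any compiler is invoked, which is what lets both styles of build system reject the graph up front.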

Subsetting the checkout (expand/contract)

Google’s secret sauce.

The sim also demonstrates Git sparse-checkout, driven by Aether. A gcheckout tool recursively walks .build.ae / .tests.ae, extracts every transitive dep() / lib() / npm_dep() / cargo_dep(), and adds just those directories to the sparse checkout - so your working tree can hold one app and its deps and nothing else, exactly as Google’s in-house Piper-based monorepo lets you expand and contract.
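The expand step can be sketched in Python as a transitive walk that ends in a directory list. This is not gcheckout itself: it follows only build.dep lines (the real tool also follows lib(), npm_dep() and cargo_dep()), the file contents are made up for the example, and handing the result to git sparse-checkout add is my assumption about how the directories get applied.

```python
import posixpath
import re

DEP_RE = re.compile(r'build\.dep\(\s*\w+\s*,\s*"([^"]+)"\s*\)')


def transitive_dirs(entry, read_file):
    """Directories of the entry build file and all of its transitive deps."""
    seen, stack = set(), [entry]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(DEP_RE.findall(read_file(path)))
    return sorted({posixpath.dirname(p) for p in seen})


# In-memory stand-ins for build files (illustrative contents only).
FILES = {
    "java/applications/monorepos_rule/.build.ae":
        'build.dep(b, "java/components/nasal/.build.ae")\n'
        'build.dep(b, "java/components/fricatives/.build.ae")\n',
    "java/components/nasal/.build.ae":
        'build.dep(b, "go/components/nasal/.build.ae")\n',
    "go/components/nasal/.build.ae": "",
    "java/components/fricatives/.build.ae": "",
    "java/components/sibilants/.build.ae": "",  # never read: not reachable
}

dirs = transitive_dirs(
    "java/applications/monorepos_rule/.build.ae", FILES.__getitem__
)
command = ["git", "sparse-checkout", "add", *dirs]
print(" ".join(command))
```

Contracting is the same walk in reverse: recompute the directory list for the targets you still care about and drop everything else.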

Interpreted config vs a compiled build program

It’s worth being precise about how the two DAG worlds turn build files into work, because Aeb and Bazel sit on opposite sides of a line here.

Bazel’s build files are written in Starlark - a deliberately-constrained Python dialect (no unbounded loops, no recursion, deterministic by construction). Each BUILD file is parsed to an AST, and that AST is then interpreted - tree-walked - by Bazel’s embedded Starlark interpreter. The output of that evaluation is data: the action graph Bazel’s engine subsequently schedules and executes. There is no native-code compilation of your build logic, and no static type checking of it; Starlark is dynamically typed and evaluated afresh on each invocation.

Aeb is a different pipeline. The .build.ae / .tests.ae / .dist.ae files are Aether source, and a single aeb invocation hands them (plus the build SDK - javac(), kotlinc(), go_build(), cargo_build(), tsc(), shade() and friends) to the Aether compiler (ae), which compiles and links the whole thing into one native binary and then executes it. The build logic itself becomes native code, not a graph fed to an interpreter. Because it goes through a real compiler, the build files are statically type-checked on the way through - that’s the Type checking completed with N warning(s) chatter you see scroll past during a build, catching (for instance) an unused variable in a build file before anything runs. The whole dependency graph is topologically sorted and driven from that single process with its in-memory visited-module map.

| | Starlark (Bazel) | Aeb (Aether) |
| --- | --- | --- |
| Pipeline | source → AST → interpret (tree-walk) | source → compile + link to native binary → execute |
| Build logic is | data producing an action graph | a compiled native program |
| Type checking | none (dynamically typed) | static, at compile time |
| Conditionals | if/elif/else - but only inside functions, and no way to run at “load” time beyond simple selects | full if (and if-expressions), match dispatch, guarded function clauses, and a compile-time when static-if that only type-checks the taken arm |
| Loops | for over a sequence, plus comprehensions - no while, no recursion (bounded by design) | unrestricted while (Turing-complete), head/tail list destructuring, higher-order closures - it’s a real systems language |
| Execution | interpreter re-evaluates BUILD files | one linked binary, single process |

Neither is “the right answer” - and it’s the sort of distinction that gets hand-waved as “they’re both just DAG build tools.” They parse and evaluate your intent very differently. The place people assume Bazel wins outright is containment: Starlark’s interpretation and Bazel’s sandboxed action execution buy hermeticity guarantees, and you’d expect a “little compiled binary” to make no attempt at that. But that assumption is where Aeb is actually most interesting.

Where this is heading: containment

Look again at that entrypoint - aeb(cap), not main(). The build receives a capability handle from the trusted host and never constructs its own authority. That isn’t cosmetic. Aether ships a whole containment model: a deny-by-default grant DSL (grant_fs_read, grant_tcp, grant_exec, grant_env), enforced at three layers - the compiler refusing capability imports under --emit=lib, hide / seal except at lexical-scope level, and an LD_PRELOAD shim (libaether_sandbox.so) that intercepts libc against the grant list, with a seccomp-bpf fence closing the raw-clone3/vfork process-creation bypass. The containment principle is Avalon-style Inversion of Control: the container wires the grants, the contained code merely receives them and can’t tell it’s boxed.

Point that at a build system and the picture almost writes itself. A build action is exactly a piece of code that should only touch its declared inputs, write only its declared outputs, and - for most targets - reach no network at all. The prereq(b, "rust:1.75") line already declares the toolchain a node needs OS-agnostically; the obvious next move is that each node runs under a sandbox scope derived from its dep() / prereq declarations: read access to exactly its sources and dependency artifacts, write access to exactly its target/ slot, exec of exactly its compiler, and nothing else - a build step that tries to phone home or read /etc/shadow mid-compile is denied by construction, not by policy review. That’s Bazel’s hermeticity, but arrived at from the language’s capability model rather than from a syscall-emulating action runner.
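To picture what "a sandbox scope derived from its declarations" could mean, here is a deny-by-default sketch in Python. Everything in it is hypothetical: it is not Aether's grant DSL, the NodeGrants shape and grants_for helper are invented for illustration, and the target/<module dir> layout is an assumption.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


def _under(path: str, root: str) -> bool:
    """True if path is root itself or lies beneath it; '..' is rejected."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    if ".." in p.parts:
        return False
    return p == r or r in p.parents


@dataclass(frozen=True)
class NodeGrants:
    """Deny-by-default authority for one build node (hypothetical shape)."""
    read_roots: frozenset
    write_root: str
    exec_allowed: frozenset
    network: bool = False  # no network unless explicitly granted

    def may_read(self, path):
        return any(_under(path, root) for root in self.read_roots)

    def may_write(self, path):
        return _under(path, self.write_root)

    def may_exec(self, program):
        return program in self.exec_allowed


def grants_for(node_dir, dep_dirs, compiler):
    """Derive grants from a node's own dir, its deps' outputs and its compiler."""
    reads = {node_dir} | {f"target/{d}" for d in dep_dirs}
    return NodeGrants(frozenset(reads), f"target/{node_dir}", frozenset({compiler}))


g = grants_for("java/components/vowelbase", ["rust/components/vowelbase"], "javac")
print(g.may_read("java/components/vowelbase/VowelBase.java"))
print(g.may_read("target/rust/components/vowelbase/libvowelbase.so"))
print(g.may_read("/etc/shadow"))
print(g.may_write("target/java/components/vowelbase/VowelBase.class"))
print(g.may_exec("curl"), g.network)
```

The point of the shape is that the grant set is computed from declarations the build file already makes, so nothing extra has to be written to get the restriction.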

And the repo already leans this way. There’s a vision doc describing the large version - ingesting 141 strangers’ Todo-Backend implementations across 30+ languages into one DAG with hermetic per-implementation containers, cloud-agent fan-out, and supply-chain veto on untrusted code - and the scaffolding for it is already in the source: lib/sandbox/, lib/agent/, lib/provision/ and a veto/ tree. The toolchain-selection story (aeb --prereqs picking a per-dispatch aeb-tc:rust image) is the same capability handle flowing outward to a remote build agent. So my prediction is straightforward: the aeb(cap) handle grows from “the thing that runs your build files” into “the thing that contains them” - per-node grants, containerised remote execution, and a veto layer over untrusted third-party build files - and the compiled-binary model turns out to be an advantage here, because the sandbox is the same one the language already enforces at compile time, not a separate runtime wrapper bolted on.

The honest caveat: today none of that is wired into the build path I ran above - aeb(cap) receives the handle but the per-node grant plumbing is vision, not shipped. Bazel’s sandboxing is real and running now; Aeb’s is a credible trajectory with the primitives already in the tree. But “they’re both just DAG build tools” misses that one of them is a config language that produces an action graph, and the other is a capability-secure systems language that could end up being the sandbox.

A note on scale

The modules here are not representative of enterprise sizes - each is a handful of tiny classes with a single placeholder test. In reality there could be hundreds of sources per module and tens of seconds of compile and test each. For Google’s thousands of applications and libraries with high test coverage, a from-root build of everything could take days on one workstation - and is never done, even with Blaze, given the trunk checkout was 90 GB back in 2012 and no dev workstation ever pulled all of it. That’s exactly why expand/contract and a leaf-first DAG matter.

This repo goes hand in hand with my book, Trunk-Based Development and Branch by Abstraction, and a short video about it … though the video shows the older shell-script version, which is now deleted, even if the experience was the same.



Published

July 2nd, 2026