Paul Hammant's Blog: More on Depth-first recursive vs DAG build technologies
Well, it turns out I was wrong in the blog entry a few days ago: Bazel doesn’t do a single compiler invocation for the set of impacted BUILD modules for a target. At least not for Java. It invokes javac once per BUILD module, in most-depended-on order, working towards the intention of the build target. On the way it makes mini libs (jars). I could be wrong about that too, if my Bazel skills are so bad that I missed built-in ways of working.
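To make “most-depended-on order” concrete, here is a minimal sketch using the standard tsort utility. The module names and the edges between them are made up for illustration (they are not read from the repo’s BUILD files), and Bazel does not use tsort itself. The point is only what a topological ordering of a dependency graph looks like.

```shell
#!/bin/sh
set -eu

# Each line reads "dependency dependent": the left module must be built first.
# Hypothetical modules and edges, for illustration only.
edges='api defaults
api pico
defaults pico
pico tests'

# tsort emits every node after the nodes it depends on.
order=$(printf '%s\n' "$edges" | tsort | xargs)

# One compiler invocation (and one mini jar) per module, in that order.
for module in $order; do
  echo "javac + jar for module: $module"
done
```

With these particular edges there is only one valid order: api, defaults, pico, tests.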
Remember PicoContainer from 16 years back? Well, JetBrains do: it’s wedged inside their Java-based IDEs. I’ve pulled it out here: github.com/picocontainer/PicoContainer1.9. JetBrains did a bunch of work on top of their fork of PicoContainer 1.1: generics work and more. They also changed the package structure to suit them, which isn’t so important. They kept the original author tags: Aslak Hellesøy, Jörg Schaible, Jon Tirsén, Mauro Talevi, Paul Hammant, Thomas Heller. A few more contributed, but we didn’t put them in author tags.
Back in the day, PicoContainer 1.1 had Maven as a build tech (depth-first-recursive). With this extraction, it now has three alternate build techs:
- Maven (depth-first recursive)
- Bazel (directed acyclic graph)
- Bash scripts executing vanilla Java JDK commands with variations (neither depth-first recursive nor DAG)
The source tree ALSO contains a source file that doesn’t compile, RedHerring.java. This is just for the purposes of this blog entry.
The Maven build masks out that source file via compiler-plugin config:
<excludes>
  <exclude>**/RedHerring.java</exclude>
</excludes>
It is important to note that this Maven build isn’t multi-module, so depth-first recursive doesn’t apply.
The Bazel build notes that nothing depends (a DAG, remember) on the BUILD module that contains that source file, so it never builds it unless it is explicitly mentioned. It took lots of trial and error to get the multiple Bazel BUILD files right. In this effort, I found unused_deps to be very helpful.
bazel build //src/java/com/intellij/util/pico:PicoContainer_deploy.jar
# or
bazel test //src/test/org/picocontainer/tests/integration:tests
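The “never builds it unless explicitly mentioned” behaviour falls out of walking the graph from the requested target. A rough sketch of that idea in shell, again with hypothetical module names and dependencies rather than the real BUILD contents:

```shell
#!/bin/sh
set -eu

# Hypothetical direct dependencies per module (not the real BUILD files).
deps_of() {
  case "$1" in
    tests)      echo "pico" ;;
    pico)       echo "defaults api" ;;
    defaults)   echo "api" ;;
    api)        echo "" ;;
    redherring) echo "api" ;;   # depends on others, but nothing depends on it
  esac
}

# Collect every module reachable from the requested target.
needed=""
visit() {
  case " $needed " in *" $1 "*) return ;; esac   # already visited
  needed="$needed $1"
  for dep in $(deps_of "$1"); do
    visit "$dep"
  done
}

visit tests
echo "modules needed for the tests target:$needed"
```

The redherring module has outgoing edges but no incoming ones, so a walk that starts at tests never reaches it. Calling visit redherring would be the equivalent of naming it explicitly on the command line.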
I also have three shell scripts invoking javac, java and jar in order.
The first one chokes on the bad source file, by design. The remaining two builds (and the Maven and Bazel ones) make a jar of PicoContainer’s classes after running a single test. They use two different strategies for picking source files. The “wise” one uses a built-in feature of javac that can work out what else needs to be compiled in that invocation. The “masked” one uses “sed” to eliminate RedHerring.java just before invoking the compiler.
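A minimal sketch of the two strategies, assuming the “wise” one relies on javac’s -sourcepath option. The tiny source tree here is a hypothetical stand-in, and these are not the repo’s actual scripts. The javac steps only run if a JDK is on the PATH.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/demo/redherring" "$work/out/masked" "$work/out/wise"

# Hypothetical stand-ins for the real source tree.
cat > "$work/src/demo/Main.java" <<'EOF'
package demo;
public class Main {
    public static void main(String[] args) { System.out.println(Helper.greet()); }
}
EOF
cat > "$work/src/demo/Helper.java" <<'EOF'
package demo;
class Helper { static String greet() { return "hello"; } }
EOF
echo "this file does not compile, on purpose" > "$work/src/demo/redherring/RedHerring.java"

# "Masked": list every source file, then drop the bad one with sed.
masked_sources=$(cd "$work" && find src -name '*.java' | sed '/RedHerring\.java$/d' | sort)
echo "$masked_sources"

if command -v javac >/dev/null 2>&1; then
  # Masked build: compile the filtered list (left unquoted on purpose so it splits into arguments).
  (cd "$work" && javac -d out/masked $masked_sources)
  # "Wise" build: name only the entry point; javac finds Helper.java via -sourcepath
  # and never looks at RedHerring.java because nothing references it.
  (cd "$work" && javac -d out/wise -sourcepath src src/demo/Main.java)
fi
```

The masked approach has to know the bad file’s name up front. The sourcepath approach only compiles what the named entry point transitively references.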
Mucking around with Git’s Sparse Checkout
One of the features of Google’s Blaze that has not made it into Bazel yet is the built-in subsetting of the source tree (monorepo nirvana).
Git’s “Sparse Checkout” feature came with Git 2.25 (Jan 2020). I had listed it as something that was needed in June 2019 but who knows whether the git leads read my blog entry. See also my VCS-nirvana update to the same list in Jan 2020.
We can play with sparse checkout to mimic Google monorepo life:
git sparse-checkout init --cone
git sparse-checkout set third_party src/java/org/picocontainer/defaults "src/java/org/picocontainer/*.java" "src/java/org/picocontainer/BUILD" src/java/org/jetbrains src/java/com src/test README.md pom.xml "*.sh" WORKSPACE ".all*"
This elaborate modification to the checkout allows RedHerring.java to be ignored in all situations. Specifically, the ./naive_classic_build.sh build passes instead of failing. Whereas naive_masked_build.sh masked out the red herring via the sed trick, this new way doesn’t need to exclude the dir/file, as it is no longer there in the file system. It is still in the .git/ backing store, but not in the “checkout”.
Git sparse checkout has ‘set’ for a big list as above. This smushes prior settings each time you use it. It also has ‘add’ which adds new patterns to the list it had before, instead of smushing. I think a ‘remove’ sub-command is needed. My case above could be just:
git sparse-checkout init --cone
git sparse-checkout remove src/java/org/picocontainer/redherring
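The difference between ‘set’ and ‘add’ is easy to see in a throwaway repo. This is a small sketch with hypothetical directory names a, b and c, assuming a Git new enough to have the sparse-checkout add sub-command:

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"

# Throwaway repo with three hypothetical top-level directories.
git init -q .
mkdir a b c
touch a/file b/file c/file
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

git sparse-checkout init --cone
git sparse-checkout set a    # checkout now holds a/ (plus any root-level files)
git sparse-checkout set b    # smushes the previous list: a/ goes, b/ arrives
git sparse-checkout add c    # appends: b/ and c/ are both present

patterns=$(git sparse-checkout list | xargs)
echo "sparse patterns: $patterns"
ls
```

After the last command only b and c are in the working tree; a is still in the repository’s history, just not checked out.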
Mucking around this way gives you a glimpse of what Google’s Blaze would readily do for committers in their monorepo: subsetting down from many hundreds of different teams’ permutations of directories for meaningful buildable deployables.
Note that the Maven build still works for the sparse checkout, as does naive_masked_build.sh. Nothing changed from their point of view.
When you’ve finished mucking around, this resets you back to non-sparse.
git sparse-checkout disable
Also, in closing, I still think a nirvana monorepo build system would do a single invocation of the compiler where the target/intention supports that.