Quicker Local Maven Builds

Maven is a depth-first recursive build technology. A project can comprise multiple modules in one repo, and if ‘mvn install’ is launched from the root of that repo it will recurse through all the modules depth-first. It’ll do so regardless of what’s changed. There’s the rub: you may have only changed one source file in one module, and that may have been test logic, not ‘prod’ source. That means a bunch of needless recursing to get to a “BUILD SUCCESS” conclusion. And we do run this before we commit/push, because we’re thorough, not lazy, developers.

What’s needed is a faster Maven build for such ‘smaller’ changes. Not for minute-by-minute development, as you’ll just launch a specific test in your IDE for that, but for the “I’m done” situation. Specifically: I’m done and I should run “the build” before checking in.

Here’s a Python script that’ll attempt to build the smallest number of modules for the pending changes (still using Maven):

	#!/usr/bin/env python3

	# See https://paulhammant.com/2019/10/20/quicker-local-maven-builds

	import os
	import sys
	from pathlib import Path

	import sh  # third-party: pip install sh

	# Find the most recent commit that origin/master points at.
	log = str(sh.git.log("--oneline", "--no-color", "--decorate=short", _tty_out=False))
	hashLine = ""
	for line in log.split("\n"):
	    if ' (origin/master,' in line or 'master, origin/master' in line:
	        hashLine = line
	        break

	if not hashLine:
	    print("Could not determine origin/master SHA1")
	    sys.exit(1)
	originHash = hashLine.split(" ")[0]

	# Everything changed since that commit, committed or not.
	# str(...).splitlines() yields whole lines whether sh returns a str or a command object.
	diffFiles = str(sh.git.diff("--name-status", originHash, _tty_out=False)).splitlines()

	# Every directory holding a pom.xml is a candidate module.
	allModules = []
	for filename in Path('.').glob('**/pom.xml'):
	    allModules.append(str(filename).split("/pom.xml")[0])

	# Keep only the modules that have a src/main tree.
	allModulesWithSourceTrees = []
	for filename in Path('.').glob('**/src/main'):
	    srcDir = str(filename).split("/src/main")[0]
	    if srcDir in allModules and srcDir not in allModulesWithSourceTrees:
	        allModulesWithSourceTrees.append(srcDir)

	impactedProdModules = []
	impactedTestModules = []

	for diffFile in diffFiles:
	    fields = diffFile.split("\t")
	    if len(fields) < 2:
	        continue  # blank or unexpected line
	    file = fields[1].strip()
	    for m in allModulesWithSourceTrees:
	        # Match on "module/" so that "core" does not also claim "core-utils/...".
	        if not file.startswith(m + "/"):
	            continue
	        # A prod change means install and test; a test-only change means test only.
	        if "src/test/" not in file and m not in impactedProdModules:
	            impactedProdModules.append(m)
	        if m not in impactedTestModules:
	            impactedTestModules.append(m)

	with open(".mvnCommands.sh", 'w') as out:
	    out.write("#!/bin/sh\n\nset -e\n\n")
	    # Leave a command out when its module list is empty: -pl needs at least one module.
	    if impactedProdModules:
	        out.write("mvn install -DskipTests -pl " + ",".join(impactedProdModules) + "\n\n")
	    if impactedTestModules:
	        out.write("mvn test -Dmaven.main.skip -pl " + ",".join(impactedTestModules) + "\n\n")

	os.chmod(".mvnCommands.sh", 0o775)

	print(".mvnCommands.sh updated")


What this does first is look for the last commit SHA1 for origin/master. Using that as its baseline, it determines what has changed since: the local commits as well as uncommitted code (though brand-new files that Git is not yet tracking won’t show up in the diff). It then uses that list of changed files to calculate the changed Maven modules.
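The baseline lookup can be pulled out into pure functions to make the parsing easier to see. This is a minimal sketch rather than the script's own code: `find_origin_master_hash` and `changed_paths` are hypothetical names, and they assume the text formats produced by `git log --oneline --decorate=short` and `git diff --name-status`. Unlike the script, the sketch also accepts a commit decorated with `origin/master` alone, and it takes the new path of a renamed file.

```python
import re
from typing import List, Optional


def find_origin_master_hash(log_text: str) -> Optional[str]:
    """Return the abbreviated hash of the first commit decorated with origin/master."""
    for line in log_text.splitlines():
        # Decorated lines look like: "abc1234 (HEAD -> master, origin/master) message"
        match = re.match(r"^(\S+) \(([^)]*)\)", line)
        if not match:
            continue
        refs = [ref.strip() for ref in match.group(2).split(",")]
        if "origin/master" in refs:
            return match.group(1)
    return None


def changed_paths(name_status_text: str) -> List[str]:
    """Extract one path per changed file from `git diff --name-status` output."""
    paths = []
    for line in name_status_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Renames and copies carry three fields (status, old path, new path); keep the last.
        paths.append(fields[-1].strip())
    return paths
```

Keeping the parsing separate from the `git` calls means it can be tried against pasted command output without needing a repository.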

Changed Maven modules are sub-divided into ‘prod code’ and ‘test code’ changes. Prod changes need to be compiled, have their tests invoked, and made into jars. Test changes might not have associated prod code changes, and need to be compiled themselves, then their tests invoked. Modules that were not changed at all need neither. At least, that holds if they were already built before you started making the changes in question.
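The prod/test split can be sketched as a small function. This is an illustration under assumptions, not a drop-in replacement: `classify_modules` is a hypothetical name, the module and file paths in the usage note are invented, and paths are assumed to use `/` as the separator.

```python
from typing import Iterable, List, Tuple


def classify_modules(
    changed_files: Iterable[str], modules: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (modules to install, modules to test) for the given changed files."""
    module_list = list(modules)
    prod: List[str] = []
    test: List[str] = []
    for path in changed_files:
        for module in module_list:
            # Require the separator so "core" does not match "core-utils/...".
            if not path.startswith(module + "/"):
                continue
            is_test_change = "src/test/" in path
            # Prod changes need a rebuilt jar; test-only changes do not.
            if not is_test_change and module not in prod:
                prod.append(module)
            # Any change in a module means its tests should run.
            if module not in test:
                test.append(module)
    return prod, test
```

For example, a change to `core/src/main/java/A.java` plus a change to `api/src/test/java/BTest.java` would put `core` in both lists and `api` in the test list only.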

Thus, you’d do:

git pull
mvn clean install -DskipTests

Then you’d make your changes in IntelliJ (or your preferred editor). That could include a bunch of intermediate commits (or not).

Then you’d run the script:

python3 /path/to/genMavenImpactScript.py

Then you’d run the shell script it made:

./.mvnCommands.sh

Indeed, you could keep re-running that generated script, without regenerating it, for as long as you know your changes have not spread to more modules.

Note: You might want to put .mvnCommands.sh in your .gitignore file so that it is not checked in.

Google’s Bazel is a directed graph build system that has smarts built into the BUILD files per package/dir in the repo structure. This allows a very efficient determination of the minimal compilation set, as well as the minimal number of tests (Test Impact Analysis) to invoke. At its best, for very large projects, it can be an order of magnitude more efficient than Maven in its fully recursive mode. For smaller projects, it can be a mere 20-50% more efficient. There is complexity with Bazel, though, in that you have to work out what the target is before it calculates dependencies. Here’s one for Selenium:

bazel test java/client/test/org/openqa/selenium/chrome

(see the Selenium wiki here)

The dependencies for that are other Java packages.

My script (subject to improvements for perfect Test Impact Analysis) could split the difference in build execution times and it is easy enough to invoke in that it does not have any parameters (yet).
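One possible step toward fuller impact analysis is Maven’s `-amd` (`--also-make-dependents`) option, which extends a `-pl` list with the modules that depend on it. The sketch below is an assumption about how the script could grow, not something it does today: `maven_commands` and its `include_dependents` parameter are hypothetical, and whether rebuilding and retesting dependents is worth the extra time will vary by project.

```python
from typing import List


def maven_commands(
    prod_modules: List[str],
    test_modules: List[str],
    include_dependents: bool = False,
) -> List[str]:
    """Build the Maven command lines for the impacted modules."""
    commands: List[str] = []
    prod_list = ",".join(prod_modules)
    if prod_modules:
        suffix = " -amd" if include_dependents else ""
        commands.append("mvn install -DskipTests -pl " + prod_list + suffix)
    if include_dependents and prod_modules:
        # Retest the changed prod modules together with everything that depends on them.
        commands.append("mvn test -Dmaven.main.skip -pl " + prod_list + " -amd")
        # Test-only modules are run on their own, without pulling in dependents.
        remaining = [m for m in test_modules if m not in prod_modules]
    else:
        remaining = list(test_modules)
    if remaining:
        commands.append("mvn test -Dmaven.main.skip -pl " + ",".join(remaining))
    return commands
```

With `include_dependents` left at its default, the output matches the two commands the script writes now, except that a command is left out when its module list is empty.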


Published

October 20th, 2019


Paul Hammant's Blog: Quicker Local Maven Builds
