A high-throughput team in Git is going to commit & push often, and teammates are going to need to code review those changes as they come in. Well, they do if they want to prevent the accumulation of tech debt, latent defects, standards avoidance, etc. When I was at ThoughtWorks I was placed at Google on the Test Mercenaries team up to the start of 2009, so my perspectives are ten years out of date now, but I’ll give it a go at detailing the experience differences. Google’s setup had been established internally for a couple of years before GitHub launched with “Pull requests”, a feature available when GitHub moved from private beta to general availability.

Granularity Differences

Back then Google did a “continuous review” of commits that would land in master/trunk. They used Perforce (in-house Piper since 2012), and it didn’t have cost-free branches like Git, so these were patch-sets and a bunch of shell/python scripting. Guido van Rossum made Mondrian in 2005/6 to turn a command-line tech into a web one, and that’s where reviews were finalized. Indeed, it was also where bot-determined data points were added, by bots running in infra that’s overgeneralized today as Continuous Integration (CI).

Googlers made commits as small as possible to allow faster code review for peers. GitHub happily rolls up many commits into a single code review. Making a series of small commits in any codebase is one of the habits of eXtreme Programming (XP) veterans. If a piece of work uncovered the need for a big-ass method/class rename, the XP veteran would git-stash, do the rename as its own commit/push, then git-stash-pop to carry on with the planned piece of work. The rename was unplanned but needed. The code review for the rename should be a snap (especially if semantic merge takes off), and it would liberate the later code review from the noise of the rename.
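For the record, that stash dance is only a handful of git invocations. Here’s a minimal sketch of it as a Python helper (the class names in the commit message are made up, and the rename itself happens in your IDE mid-script):

```python
import subprocess

def git(*args):
    subprocess.run(("git",) + args, check=True)

git("stash")        # shelve the planned-but-unfinished work
input("Do the big rename in the IDE now, then press Enter... ")
git("add", "-A")    # stage the renamed/touched files
git("commit", "-m", "Rename FooManager to FooService (mechanical, no behavior change)")
git("push")
git("stash", "pop") # resume the planned piece of work on top of the rename
```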

I can see the pros and cons of one review per commit (time-separated), versus one single review for a bunch of commits (at the same time).

Extra data points

Google’s system had loads of bot-injected intelligence for each commit. Those included FindBugs findings, standards deviations, vulnerabilities, and code coverage changes. Obviously, also build passes/fails, and I might remind you that “the build” should extend to all functional verifications (unit tests, services tests, UI automation via Selenium or equivalent, etc.). The key to success there is to have the bot-added data points come in BEFORE the early majority have arrived for code review. Indeed, the number of people happy to do code review is always smaller than the number happy to write code. Thus teams should be prepared to keep stats on reviews and reviewing, then reward people for the right ratio and timeliness.
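As a sketch of that “keep stats” idea: GitHub’s REST API exposes PR creation and review timestamps, so a small script can report how long reviewers took to show up. The repo name below is hypothetical, and real use would need an auth token and pagination:

```python
from datetime import datetime
import requests

REPO = "example-org/example-repo"  # hypothetical

def ts(s):
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%SZ")

# Recently closed PRs, and the elapsed time until each got its first review
pulls = requests.get(f"https://api.github.com/repos/{REPO}/pulls",
                     params={"state": "closed", "per_page": 30}).json()
for pr in pulls:
    reviews = requests.get(pr["url"] + "/reviews").json()
    if reviews:
        first = min(ts(r["submitted_at"]) for r in reviews)
        hours = (first - ts(pr["created_at"])).total_seconds() / 3600
        print(f'#{pr["number"]} first reviewed after {hours:.1f}h')
```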

GitHub has a marketplace where you can buy a bunch of those bot-added data points. Buying can be by “seats,” which is vague. Maybe that is because some vendors price per user using/seeing the data points, while others price to include the number of bots making the data points. Maybe it’s not that complicated, and it’s just me being picky.

The same marketplace lists commit/push build-automation services. Things popularly described as CI, even if the tenets of CI are often trampled on. Some of the same build automation & bot technologies are available for setup outside the marketplace too. Yet more can be set up that are not listed in the GitHub marketplace at all (no commissions for GitHub). Some too are $0 per user. Those are either $0 for a small number of users (I mean seats), or $0 because they’re wholly open source with no commercial service arm of the same dev team. Labors of love, perhaps.

Code Coverage Changes

(controversial topic noted)

To me, it feels too hard to get a code coverage change into the Pull Request on GitHub.

That was easy for Google as they enforced a directed graph build system (Blaze to them; to us outside Google: Bazel). Blaze had to deal with four mainstream languages for Google: C++, Java, Python, and JavaScript. Soon after, Go was added, and now there’s probably Kotlin, TypeScript, and Dart as mainstream. Niche languages included Objective-C for iOS apps (Google made “Maps” for the launch of the iPhone) and ActionScript (back then, but probably not now). Anyway, directed graph build systems have a rootwise invocation: “compile” or “test” (which depends on compile). Google’s Blaze system could subset down to just the changed/impacted bits and pieces quite easily. That’s important if you have nine million source files in one directory structure in one repo. Subsetting allowed data points to be added more quickly, and you really want those in before code-reviewers turn up. If you can’t achieve this, you’re on a domino effect towards code reviews not being done, or being done too late to be efficient.
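To illustrate the subsetting idea outside Google, with Bazel: map a commit’s changed files to targets, then ask for the tests that transitively depend on them. This is a sketch only; the path-to-label mapping is hand-waved and the query recipe may need tuning for a real workspace:

```python
import subprocess

def out(*cmd):
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Files touched by the last commit...
changed = out("git", "diff", "--name-only", "HEAD~1", "HEAD").split()
if changed:
    # ...and every test target that transitively depends on them
    query = "tests(rdeps(//..., set({})))".format(" ".join(changed))
    print(out("bazel", "query", query))
```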

Back to GitHub again: they are open to other people’s choice of build tool. There are 8-12 alternates for Java, 20-40 alternates for JavaScript/Node, maybe 4-7 for .NET (phew!), and so on. Even within build technologies there are more choices for coverage calculation. Most build technologies are depth-first recursive, so collected coverage data isn’t easy to aggregate into something meaningful. If your build recursed into eight modules, and each had its own coverage number, then you’d want to display the changes in coverage for all the modules impacted by the commit. Right now in 2019, teams will most likely have to add their own pipeline steps to collect coverage data. With that, and the cooperation of the build-server/service vendor, they can make output suitable for code review and insert it into GitHub’s review UI. Bots can update data points in GitHub’s pull-request UI, but I’d hope for more UI/graphics flair than I’ve seen.
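As an example of the do-it-yourself aggregation step, here’s a sketch that walks a multi-module build for Cobertura-style XML reports (the file layout is an assumption) and prints a per-module summary a bot could post:

```python
import glob
import xml.etree.ElementTree as ET

# Hypothetical layout: each module writes target/coverage.xml (Cobertura format)
for report in sorted(glob.glob("*/target/coverage.xml")):
    module = report.split("/")[0]
    # Cobertura's root <coverage> element carries an overall line-rate attribute
    line_rate = float(ET.parse(report).getroot().get("line-rate", "0"))
    print(f"{module}: {line_rate * 100:.1f}% line coverage")
```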

Testability Score

Miško Hevery (Googler, former “Test Mercenary” teammate, friend, and co-author of AngularJS) used some of his 20% time to make Testability Explorer. For about one month a few Java OSS teams competed for the top score before snapping out of it and getting back to regular work.

Later, Miško did some ActionScript to make a bubbles view of the patronage of testing per commit (see Jon Wolter’s blog entry). I’ve tried to make an AngularJS version of the same.

So these feel like things that could be bot-added to Pull Requests in order to increase the efficiency of code reviews. Sure, I don’t need a timeline spanning more than the current commit and the previous one, but I’d like to know how much test code was worked on versus prod code. Miško’s idea was to simply count the lines-of-code churn in prod vs non-prod directories. I’d also like to know how “testable” the code is versus the previous commit.
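That line-counting idea is cheap to approximate with git alone. A sketch, with the prod/test directory naming convention being an assumption:

```python
import subprocess
from collections import Counter

# Per-file added/deleted line counts for the most recent commit
numstat = subprocess.run(["git", "diff", "--numstat", "HEAD~1", "HEAD"],
                         capture_output=True, text=True, check=True).stdout
churn = Counter()
for line in numstat.splitlines():
    added, deleted, path = line.split("\t")
    if added == "-":  # binary files have no line counts
        continue
    bucket = "test" if "/test/" in f"/{path}" else "prod"  # naming convention assumed
    churn[bucket] += int(added) + int(deleted)
print(f"prod churn: {churn['prod']} lines, test churn: {churn['test']} lines")
```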

GitHub challenges

Experience

I think GitHub’s part of the improvements needed is around the experience of the code review. All the types of data points that bots can add: are they able to be added in the best way, interactively and aesthetically? How much clicking is required to get to the place where an epiphany may happen? How much noise is woven into the presentation of the reviewable code? How easy is it to inject graphics/pics/charts into the PR? If that were easier, I might expect my build automation provider to do much of the heavy lifting for that, not a wholly separate service for/via GitHub. It would make more sense to do convoluted docker/VM/process choreography once in the build-automation platform than to do it multiple times for each vendor’s GitHub plugin from the marketplace. Does that mean that each build-automation vendor has their own marketplace? And if they did, what would GitHub think about that in terms of lost commissions? Through all of that, I’m reminded that tools for developers is a race to zero. And if services for developers isn’t yet a race to zero, it’s only ten years behind the other. Note: a race to zero takes a bell curve and clips it on the right-hand side.

Terseness of reviews

Git doesn’t do semantic diffs. At least, not yet. No doubt the community is rabidly against it and will be right up until it is delivered, then it will be rabidly for it. Note that many technologies have a rabid “fan boy” side.

If a data point could be added to a Pull Request like “99% chance this was a method rename”, with “click here to see the classic lines-added/changed diff”, then that would change a lot. That older XP workflow of pausing work to do refactorings, and separating the commits for those from functional changes, might come back and drive the industry toward continuous review.
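To make that concrete, a toy guess-the-rename heuristic might diff the sets of method names between two versions of a file. This is regex-level crudeness, nowhere near real semantic diffing, and the file names are made up:

```python
import re

def method_names(source):
    # Crude match for Java-ish method declarations; real tooling would parse an AST
    return set(re.findall(r"(?:public|private|protected)\s+\w+\s+(\w+)\s*\(", source))

before = method_names(open("Before.java").read())  # hypothetical inputs
after = method_names(open("After.java").read())
gone, arrived = before - after, after - before
if len(gone) == 1 and len(arrived) == 1:
    print(f"Likely method rename: {gone.pop()} -> {arrived.pop()}")
```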

Note that the PlasticSCM people are way ahead with their semantic merge technology. I wonder, too, if the likes of JetBrains will begin to play here, with their IDEs somehow providing metadata per commit to communicate the refactorings completed in each - something that GitHub could pick up on for presentation in code review.

PRs, Branches, Names

Sometimes you see people re-use the same PR branches for successive and unrelated reviews. Maybe that’s resurrecting the same branch name after a close-PR action deleted it in ‘origin’ prior to recreation. Sure it can be made to work, but it is error prone. Similarly, sometimes teammates re-use the same Pull Request name for unrelated activities. They’re still separate in GitHub, but other systems attempting to correlate activities might get confused.

Google’s commit-centric way of working seemed more streamlined.

Lost Intellectual Property

GitHub stores code reviews in a database, not in Git. When PR branches are deleted, you’ve either consumed (merged/integrated) the code into master, or you’ve chucked it away. GitHub allows you some time to restore the branches associated with a closed PR, but that can’t last forever. That’s small as issues go, though. As the review commentary (and bot data points) are not in a Git repo, they’re not eminently cloneable and pushable to GitLab or Gitea. There are ways to preserve such info long term, but the way it works today feels like lock-in.
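One mitigation: periodically pull the review commentary out via the API and commit it next to the code. A sketch, with hypothetical repo and PR number, and no auth or pagination:

```python
import json
import requests

REPO, PR = "example-org/example-repo", 42  # hypothetical
base = f"https://api.github.com/repos/{REPO}/pulls/{PR}"
archive = {
    "reviews": requests.get(f"{base}/reviews").json(),    # approvals, change requests
    "comments": requests.get(f"{base}/comments").json(),  # inline review comments
}
with open(f"pr-{PR}-review-archive.json", "w") as f:
    json.dump(archive, f, indent=2)
```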

Slowness and Redundant Reviews

GitHub informs you of multiple repo-centric events pertinent to you in a ‘notifications’ section of the site. Also via messaging/emails and presumably Slack-style integrations. If you’re in the web app, you’ll be clicking on pull requests from teammates quite a lot. Well, if you and your teammates are doing small stories and small PRs. Your new modest angst will be the back-button way of getting back to the list of notifications once you’ve dipped into one. For me it takes a second or two to get into a PR from the notifications page and 1 to 1.5 seconds to get back to the (hopefully smaller) list of notifications. That’s annoying enough for me to have a different workflow: use the browser’s “Open In New Tab” feature to batch-open them all, then just visit each tab in turn and close each once processed.

Related: teammates might have fully reviewed each other’s PRs, and me showing up is not needed. I don’t know if things can disappear from notifications lists if some criterion is later met. I have some IMAP-processing Python that attempts to roll up my notifications, but it’s not that advanced either.
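GitHub’s notifications API can do better than IMAP scraping. A sketch of a rollup, assuming a personal access token in a GITHUB_TOKEN environment variable:

```python
import os
import requests

# Unread notifications for the authenticated user, filtered to pull requests
resp = requests.get("https://api.github.com/notifications",
                    headers={"Authorization": "token " + os.environ["GITHUB_TOKEN"]})
for n in resp.json():
    if n["subject"]["type"] == "PullRequest":
        print(n["repository"]["full_name"], "-", n["subject"]["title"])
```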

Context/Reminder

I hate the GitFlow branching model, and GitHub Flow is only different to the (much older) Trunk Based Development (TBD) in one small way. Sure, go ahead with GitFlow or any other cockamamy branching model, then call me to help you get to TBD. So, nothing I talk about here in this article will alleviate the problems of the GitFlow (or worse) branching model.



Published

August 2nd, 2019