Paul Hammant's Blog: Google's vs Facebook's Trunk-Based Development
Update: See the new resource site for Trunk-Based Development called, err, TrunkBasedDevelopment.com and make sure to tell your colleagues about it and this high-throughput branching model.
I’ve been pushing this branching model for something like 14 years now. It’s nice to see Facebook say a little more about their Trunk-Based Development. Of course they’re not doing it because they read anything I wrote. The practice isn’t mine; it’s been hanging around in the industry for many years, but always as a bridesmaid, so to speak.
If not Trunk, what?
Mainline?
Mainline as popularized by ClearCase is what we’re trying to kill. At least historically. It’s very different to Trunk-Based Development, and even having vastly improved merge tools doesn’t make it better - you still risk regressions, and huge nerves around ordering of releases.
ClearCase’s best-practices also foisted a ‘many repos’ (VOBs) design on teams using it, and that courted the whole Conway’s Law prophecy. I mentioned Conway’s Law before in Scaling Trunk-Based Development, and it concerns undue self-importance of teams around arbitrary separations.
Multiple small repos for a DVCS?
There is a great statement by a Reddit user in the programming section of Reddit, in conjunction with the Facebook announcement:
This Redditor is right: there’s a lack of atomicity around a many-repos design, and that stymies bisect. It could be that Git subtrees (not submodules) are a way of getting that back (thanks @chris_stevenson on a back channel). There’s also a real problem moving code easily between repos (with history), though @offbytwo (back channel again) points out that subtrees, carefully used, can help do that.
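To make the bisect point concrete, here is a minimal sketch of the binary search that a bisect performs over one linear trunk history. The revision names and the `is_good` check are hypothetical. It assumes the oldest revision is good, the newest is bad, and the breakage persists once introduced. The search only works because each revision is one atomic snapshot of everything; with many repos there is no single ordered list to search.

```python
from typing import Callable, Sequence


def first_bad_commit(history: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the earliest revision in `history` that fails `is_good`.

    Assumes `history` is ordered oldest to newest, history[0] is good,
    history[-1] is bad, and a breakage stays broken once introduced.
    """
    lo, hi = 0, len(history) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(history[mid]):
            lo = mid  # breakage was introduced after mid
        else:
            hi = mid  # breakage was introduced at or before mid
    return history[hi]


# Hypothetical trunk: every revision is one atomic snapshot of all the code.
trunk = ["r100", "r101", "r102", "r103", "r104", "r105"]
broken_since = "r103"  # pretend the build has failed from r103 onwards

culprit = first_bad_commit(trunk, lambda rev: rev < broken_since)
print(culprit)
```

In real use `is_good` would check out the revision and run the build or a test, which is roughly what `git bisect run` and `hg bisect` automate.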
Trunk at Google vs Facebook
Tuesday’s announcement was from Facebook, and to give some balance, there’s deeper info on Google’s trunk design in: Google’s Scaled Trunk-Based Development.
Subsetting the trunk for checkouts
TL;DR: different
Google have many thousands of buildable and deployable things, which have very different release schedules. Facebook don’t as they substantially have the PHP web-app, and apps for iOS and Android in different repos. Well at least the main PHP web-app is in the Mercurial Trunk they talked about on Tuesday. I’m not sure how the iOS and Android apps are managed, but at least the Android one is outside the main trunk.
Google subset their trunk. I posted about that on Monday. In that article I pointed out that the checkout can grow (or shrink) depending on the nature of the change being undertaken. It’s very different to a multiple-small-repos design.
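As an illustration only (this is not a description of Google’s actual tooling), a growing or shrinking checkout can be thought of as the transitive closure of the directories a change touches. The directory names and the `deps` map below are made up for the sketch.

```python
from typing import Dict, List, Set


def checkout_set(targets: List[str], deps: Dict[str, List[str]]) -> Set[str]:
    """Return every directory needed to build `targets`, following deps transitively."""
    needed: Set[str] = set()
    stack = list(targets)
    while stack:
        directory = stack.pop()
        if directory in needed:
            continue  # already part of the checkout
        needed.add(directory)
        stack.extend(deps.get(directory, []))
    return needed


# Hypothetical dependency map between directories in one big trunk.
deps = {
    "apps/mail": ["lib/auth", "lib/ui"],
    "apps/ads": ["lib/auth", "lib/billing"],
    "lib/ui": ["lib/base"],
    "lib/auth": ["lib/base"],
    "lib/billing": ["lib/base"],
}

small = checkout_set(["apps/mail"], deps)
grown = checkout_set(["apps/mail", "apps/ads"], deps)
print(sorted(small))
print(sorted(grown - small))  # what the checkout gains when the change widens
```

The point of the sketch is that the subset is derived from the change at hand, while the history and the commit stay trunk-wide. That is what separates it from a multiple-small-repos design.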
Facebook don’t subset their trunk on checkout, as they do not need to; the head revisions of everything in that trunk are not big enough for a C: drive or IDE to buckle. There’s also no compile stage for PHP, for regular development work.
Maximized Sharing of Code
TL;DR: the same
Code is shared using globbed directories within the source tree. It’s shared as source files, in situ, rather than classes in a jar (or equivalent).
Refactoring
TL;DR: the same
Developers take on refactorings where appropriate. Sure it means a bigger atomic commit, but knowing all the affected source is in front of you as you do the refactoring is comforting. At least, knowing that if IntelliJ (or Eclipse, etc) completes the refactoring there’s a very strong possibility that the build will stay green, and that you’re only going to have a slight impact on other people’s working copy, and only if they are concurrently editing the same files. Bigger refactorings probably still require a warning email.
Super tooling of the build phase
TL;DR: the same
Google have what amounts to a super-computer doing the compilation for them (all languages that are compiled). All developers and all CI daemons leverage it. And by effective super-computer, I mean previously compiled bits and pieces are pulled out of an internal cloud-map-thing for source permutations that have been compiled before. The distributed hashmap is possibly LRU centric rather than everything forever.
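A toy, single-process sketch of that idea follows: a map from a hash of the inputs to the compiled output, with least-recently-used eviction. Everything here is an assumption for illustration (the class name, hashing source plus flags, the fake compiler); the real thing is distributed and its internals are not described here.

```python
import hashlib
from collections import OrderedDict
from typing import Callable


class CompileCache:
    """Content-addressed cache of compile outputs with LRU eviction (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(source: bytes, flags: str) -> str:
        # Same source and same flags always produce the same key.
        return hashlib.sha256(flags.encode() + b"\0" + source).hexdigest()

    def get_or_compile(
        self, source: bytes, flags: str, compile_fn: Callable[[bytes, str], bytes]
    ) -> bytes:
        key = self.key_for(source, flags)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        artifact = compile_fn(source, flags)
        self._entries[key] = artifact
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return artifact


def fake_compile(source: bytes, flags: str) -> bytes:
    # Stand-in for a real compiler invocation.
    return b"obj:" + source.upper()


cache = CompileCache(capacity=2)
cache.get_or_compile(b"int a;", "-O2", fake_compile)  # miss: compiled and stored
cache.get_or_compile(b"int a;", "-O2", fake_compile)  # hit: served from the cache
cache.get_or_compile(b"int b;", "-O2", fake_compile)  # miss
cache.get_or_compile(b"int c;", "-O2", fake_compile)  # miss: evicts "int a;"
cache.get_or_compile(b"int a;", "-O2", fake_compile)  # miss again after eviction
print(cache.hits, cache.misses)
```

When developers and CI daemons share one such map, most of a build is a lookup, because someone else has usually compiled that exact permutation already.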
Facebook don’t have that big hashmap of recently compiled bits and pieces, but they do have HipHop in the toolchain (originally a PHP to C++ compiler) which is interesting because, at face value, PHP is an interpreted language and ‘compile’ makes no sense. HipHop was created to reduce the server footprint and requirements for production deployments, while still being 100% functionally identical to interpreted PHP. It’s also faster in production. More recently HipHop became a virtual machine. It continues to be incrementally improved. Like Google, Facebook can measure cost-benefit of continued work on it (prod rack space & prod electricity vs developer salaries).
Source-Control weapons of choice
TL;DR: different
Google use Perforce for their trunk (with additional tooling), and many (but not all) developers use Git on their local workstation to gain local-branching, with an in-house developed bridge for interop with Perforce. Facebook uses Mercurial with additional tooling for the central server/repo. It’s unclear whether developers, by habit, stick with the Mercurial client software or use Git, which can interop with Mercurial backends. Both Google and Facebook do Trunk-Based Development of course.
Branches & Merge Pain
TL;DR: the same
They don’t have merge pain, because as a rule developers are not merging to/from branches. At least up to the central repo’s server they are not. On workstations, developers may be merging to/from local branches, and rebasing when they push something that’s “done” back to the central repo.
Release engineers might cherry-pick defect fixes from time to time, but regular developers are not merging (you should not count to-working-copy merges).
Eating Own Dog-food
TL;DR: mostly different
All staff at Facebook use a not-live-yet version of the web-app for all of their communication, documentation, management etc. If there’s a bug everyone feels it - though Selenium2 functional tests and zillions of unit-tests guard against that happening too often.
Google has too many different apps for the team making each to be said to be a daily user of it. For example the AdSense developer may use a dog-food version of Gmail, but they are making AdSense, so are hardly hurting themselves as they are not minute by minute using the interface as part of their regular existence at Google.
Code Review
TL;DR: same
Both Google and Facebook insist on code reviews before the commit is accepted into the remote repo’s trunk for all others to use. There’s no mechanism of code review that’s more efficient or effective.
Google back in 2009 were pivoting incoming changes to the trunk around the code-review process managed by Mondrian. I wrote about that in “Continuous Review #1” in December. I think they are unchanged in that respect: Developers actively push their commit after a code review has been completed.
Facebook have just flipped to Mercurial (from Subversion). In the article linked to at the top of the page, Facebook have not mentioned “pull request” or “patch queue”, or indeed “code review”. The article was mostly about speed, robustness and scale. -I suspect they are sitting within the semantics of Mercurial’s patch-queue processing though, although assigning a bot to it rather than a human.- Update: Simon Stewart pinged me and reminded me that they use (and made) Phabricator. He spoke about it in a Mobile@Scale presentation, and that video is here. In the video he says the review is queue based now, but that they are experimenting with landing the change sets into the master now. The video is from November, and was for the Android + iOS platforms, but it is likely to be used today for the main trunk for the PHP web-app.
Automated Testing
TL;DR: same
Heavy reliance on unit tests (not necessarily made in a TDD style). Later in a build pipeline, Selenium2 tests (for web-apps at least) kick in to guard the functional quality of the deployed app.
Manual QA
TL;DR: mostly the same
Both companies have progressively moved away from manual QA and dedicated testing professionals, towards developers testing their own stuff at discrete moments (note the Dog-food item above too).
Prod Release Frequency
TL;DR: it varies.
Facebook, for the main web app, presently release twice a day (at least on weekdays). I published info on that at the start of last year. Google have many apps with different release schedules, and some are “many times a day”, while others are “planned releases every few weeks”. Many are in between.
Prod DB deployment
TL;DR: mostly the same
Database (or equivalent) table shapes (or equivalent) are designed to be forwards/backwards compatible as far as possible.
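One common way to get that compatibility, sketched below under made-up assumptions, is for readers to tolerate both the old and the new shape while a change rolls out. The column names (`name`, `first_name`, `last_name`, `locale`) are hypothetical and not taken from either company.

```python
from typing import Any, Dict


def display_name(row: Dict[str, Any]) -> str:
    """Read a user row in either the old or the new (hypothetical) table shape.

    Old shape: a single `name` column.
    New shape: `first_name` and `last_name` columns, possibly with extra
    columns this code does not know about and simply ignores.
    """
    first, last = row.get("first_name"), row.get("last_name")
    if first or last:
        return " ".join(part for part in (first, last) if part)
    return row.get("name", "")  # fall back to the old column


old_row = {"id": 1, "name": "Ada Lovelace"}
mid_row = {"id": 2, "name": "Grace Hopper", "first_name": "Grace", "last_name": "Hopper"}
new_row = {"id": 3, "first_name": "Alan", "last_name": "Turing", "locale": "en_GB"}

for row in (old_row, mid_row, new_row):
    print(display_name(row))
```

Code like this can be deployed before, during and after the table change, so neither the application release nor the database change has to wait for the other.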
Pull Requests as part of Workflow
TL;DR: same
Etsy, GitHub, and other high throughput organizations are trunking by some definition, but using pull-requests to merge in things being done. It has different obligations if done, but Google and Facebook are not doing this in their trunks - they both essentially push (after review). Refer to the ‘Code Review’ section above.
Common Code Ownership
TL;DR: The same
You can commit to any part of the source tree, provided the change passes a fair code review. Notional owners of directories within the source tree take a boy-scout pledge to do their best with unsolicited incoming change-lists. There are strong permissions in the Google Perforce implementation, but the pledge means that contributions are not often rejected if the merit is there.
Build Is Ever Broken
TL;DR: The same
Almost Never.
Directionality of merge for prod bug fixes
TL;DR: The same
Trunk receives the defect fix, then it gets cherry-picked to the release branch. The release branch might have been made from a tag, if it didn’t exist before.
Binary Dependencies
TL;DR: The same
Checked into Source-Control without version suffixing (harmonized versions across all apps). E.g. - log4j.jar rather than log4j-1.2.8.jar.
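A minimal sketch of that renaming convention is below. The helper name and the regular expression are illustrative assumptions that only handle plain dotted numeric versions; it is not a tool either company is known to use.

```python
import re

# Matches a trailing "-<digits>(.<digits>)*" that sits directly before ".jar".
_VERSION_SUFFIX = re.compile(r"-\d+(?:\.\d+)*(?=\.jar$)")


def harmonized_name(filename: str) -> str:
    """Strip a numeric version suffix from a jar filename, if one is present."""
    return _VERSION_SUFFIX.sub("", filename)


print(harmonized_name("log4j-1.2.8.jar"))
print(harmonized_name("log4j.jar"))
```

With the version out of the filename, an upgrade is an in-place change to one checked-in file, and every app in the trunk moves to the new version in the same atomic commit.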