Paul Hammant's Blog: Introducing Branch By Abstraction
Update: See the new resource site for Trunk-Based Development called, err, TrunkBasedDevelopment.com and make sure to tell your colleagues about it and this high-throughput branching model.
Synopsis: Application developers should branch by an abstraction in the shared branch:
- instead of branching by Source Control
- instead of branching by #ifdef
- instead of branching by if condition
A lesser-known source-control best practice I’ve been pushing for a number of years is “Branch by Abstraction”. It is not my invention, and it has been best practice for many years, but it deserves a name. The suggestion is that you can convene large sets of developers in a single trunk (Trunk-Based Development) and avoid the “short lived feature branches” that you have to merge back. The problem with feature branches is that any one of them might be undeployable for a number of weeks while the team gets it right. Those branches just end up running and running ….
Provisos
There are some general provisos for the single as opposed to composite trunk design, that coincide with hard-core Agile development:
- You’ve broken your application into multiple components.
- Each component lives in its own directory inside the trunk (possibly hierarchical).
- Each directory has its own self-contained source and its own build (possibly hierarchical).
- You have a good set of unit tests and consider them important enough to illustrate snippets of example usage.
- Continuous Integration drives things, even for hundreds of components, and drops items into a Maven-like repository. CruiseControl has a nice <httpfile/> directive that, together with the <include-projects/> directive, allows you to set up the killer CI installation that is branch-ready, meaning you can run without a dedicated CI administrator.
- Your management are good at release planning.
- Developers are in the habit of never breaking the build :-)
Your trunk may look like:
<root>
  trunk/
    foo-components/
      foo-api/
      foo-beans/
      foo-impl/
        build.xml
        src/
          java/
          test/
        cruisecontrol-config-snippet.xml
      remote-foo/
    bar-services/
      bar/
        build.xml
        src/
          java/
          test/
        cruisecontrol-config-snippet.xml
      bar-web-service/
So back to the problem..
What do you do when (if) your team says they need to shift from Hibernate to iBatis (hypothetical case)? There could be thousands of classes that depend on Hibernate. The architects might suggest that the build will be broken for weeks, so a separate branch is the best place for this change. Instead, let’s try Branch by Abstraction (BBA) rather than the traditional “Branch by Source Control” (Stacy Curl coined the name by the way - I’m trying to shame him into writing a better blog article than this one).
The steps to living Branch By Abstraction
With your most responsible developers -
1. Introduce an abstraction over the core bits of the big thing you’re going to change and commit.
2. Update all the bits of code that were formerly using the thing directly to use it via the new abstraction and commit.
3. Make a second implementation of the abstraction, with unit tests that specifically test its core functionality, and commit.
4. Update all the code from (2) to use the new implementation (still via the abstraction) and commit.
5. Deprecate the first implementation (or skip to 6 if you don’t want a respectful grace period).
6. Delete the first implementation (it’s proven there is no need for you to go back).
7. Remove the abstraction (if it is inelegant).
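To make the steps concrete, here is a minimal Java sketch of the shape the code takes part-way through. Every name in it (CustomerStore, LegacyCustomerStore, NewCustomerStore, SignupService) is hypothetical, and both implementations are in-memory stand-ins rather than real Hibernate or iBatis code. It is an illustration under those assumptions, not a recipe.

```java
import java.util.HashMap;
import java.util.Map;

// Step 1: the abstraction over the thing being replaced (hypothetical name).
interface CustomerStore {
    void save(String id, String name);
    String findName(String id);
}

// The first implementation. In-memory stand-in for the existing
// (say, Hibernate-backed) persistence code.
class LegacyCustomerStore implements CustomerStore {
    private final Map<String, String> rows = new HashMap<>();
    public void save(String id, String name) { rows.put(id, name); }
    public String findName(String id) { return rows.get(id); }
}

// Step 3: the second implementation. In-memory stand-in for the
// replacement (say, iBatis-backed) persistence code.
class NewCustomerStore implements CustomerStore {
    private final Map<String, String> rows = new HashMap<>();
    public void save(String id, String name) { rows.put(id, name); }
    public String findName(String id) { return rows.get(id); }
}

// Step 2: calling code depends only on the abstraction.
class SignupService {
    private final CustomerStore store;
    SignupService(CustomerStore store) { this.store = store; }
    String register(String id, String name) {
        store.save(id, name);
        return "Welcome, " + store.findName(id);
    }
}

public class BranchByAbstractionDemo {
    // Step 3 again: one contract check that every implementation must pass.
    static void checkContract(CustomerStore store) {
        store.save("1", "Ada");
        if (!"Ada".equals(store.findName("1"))) {
            throw new AssertionError("saved row not found");
        }
        if (store.findName("missing") != null) {
            throw new AssertionError("unknown id should return null");
        }
    }

    // Step 4: the single place where the implementation is chosen.
    // Flipping this line is the whole "merge".
    static CustomerStore createStore() {
        return new NewCustomerStore();
    }

    public static void main(String[] args) {
        checkContract(new LegacyCustomerStore());
        checkContract(new NewCustomerStore());
        SignupService service = new SignupService(createStore());
        System.out.println(service.register("42", "Ada"));
    }
}
```

The point of the sketch is that the trunk compiles and passes its tests after every one of these commits, and that the switch in step 4 touches one line rather than thousands of classes. Steps 5 to 7 would then delete LegacyCustomerStore and, if the interface feels inelegant, fold it away.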
Benefits
- Only a small team is even bothered by the change.
- You can go live at any stage - because the larger application works at all times.
- Management can be adaptive about scheduling.
- Avoids merge hell.
- Introducing Abstraction helps increase understanding/modeling of piece - which is useful in itself.
Of course, BBA is not a panacea. It is just a practice that developers/architects can often apply when architects with less nerve are suggesting yet another long running feature branch. Architects should strive to do BBA instead of new feature branches. They should not hope to reach a situation where they can declare at the outset that a new branch is the “only way” to achieve something.
Oh by the way, ClearCase sucks
A buddy last week was telling me about his nameless client, which has 21 significant branches and no certainty about the order in which to merge them. That sucks. He smiled, wryly, when I guessed ClearCase as their SCM choice. ClearCase, whether in dynamic, static or UCM modes, has no place in Agile development efforts. It is a self-fulfilling prophecy that requires dozens of administrators, a few black-belt merge-meisters and multiple branches, and causes long development cycles, waterfall thinking and high staff turnover. The only thing worse than it is PVCS (who owns it now?). Anyone wanting a good SCM tool for Agile development should be looking at Perforce (a favorite of mine because IntelliJ works very well with it) or Subversion. Subversion will overtake Perforce one year soon I guess [note: Early 2007 comment].
When do you branch then?
Real branches should be made for releases only.
<root>
  trunk/
  releases/
    rel-1.0/
    rel-1.1/
    rel-1.2.x/
You may branch some days before release, then “production harden” the branch on a staging box. You’re not going to give permissions on that branch to all developers, just to a couple who are ensuring it’s ready and handling later merges (ones or twos only, if at all). You branch the release from trunk of course - given that CI proved that trunk was at all times pretty solid.
As well as Stacy Curl, I’m hoping Martin writes an article on this important practice. He is better with words than me.
Martin’s follow-ups after this was written: Branch By Abstraction, Feature Branch and Feature Toggles
(Main article written: April 26, 2007)
Update (May 2nd, 2009)
So the state of the art has shifted some from Subversion to Mercurial, Git and (some burn a candle to it) Bazaar. Prior to this update, the blog article concerns trunk best practice, and is preaching to a multi-branch development team that “trunk based development” can work for you with discipline. That discipline is “Branch by Abstraction” and “little and often” commits. Of course, the team is trying to push towards Agile and an increase in the frequency of deployments to production, with fewer defects each time. ClearCase is often where they are coming from. So I had an FX trading client in 2005 that I persuaded to move from multi-branch development to trunk-based. There was a lot of choreography to move from the entangled source tree represented in fifty branches to a trunk metaphor broken out into buildable components as outlined. It took months of course, but left them with a clearer understanding of their workflow and asset control.
Later in 2005, Roy Singham (ThoughtWorks owner) made me fly in the middle of the OSCON conference (!) to CollabNet’s offices to present on Subversion and the importance of trunk based development, as a way for their sales engineers to pitch to corporations that are otherwise invested in ClearCase, StarTeam, PVCS etc. The theory being that Subversion is a sellable piece in its own right, and that the larger Collab stack was not the only product/service of theirs worth talking about to clients.
Multi branch versus trunk diagrams
Here is a diagram of an often encountered development team branching choice: multi branch. Merges are happening in multiple directions all the time. Some branches are long lived, some short. Some branches concern functional enhancements (business value) and others are for non-functional technical reasons (like a shift from RDBMS to a distributed database). It’s chaos - the department pushes to production from any of the branches that allege that they need to go live and have all of the integrations needed. Here are the bad things associated with multi branch:
1. For weeks at a time, an individual branch can be in an undeployable state.
2. The development team will report that merging is a major part of their day, and fraught with complexity given (1).
3. Often there are regressions, as merges are missed, and the business yells at the IT dept.
4. Labeling and handling labels makes you consider another career.
Contrast to the trunk model:
So here we see concentrated development on the trunk (actually we imply that, see the diagram below). We also see releases exclusively from release branches. We see only bug fixes on the release branches, and merges back to trunk (though we might hope that all bugs are fixed on the trunk, and merged to the release branch). We see something that is not only buildable at all times, but is also deployable from anywhere with a day’s notice. Of course you are not going to deploy from just anywhere, but imagine that as a requirement from the business - “be ready to go live within a day’s notice, and have a high level of confidence”.
A day in the life of two trunk developers..
- Checkout : 2 minutes (100BaseT network)
- Checkout : 2 minutes
- Update/Sync (speculative) : 3 seconds
- Update/Sync (speculative) : 3 seconds
- Update/Sync (speculative) : 3 seconds
- Commit : 10 seconds (10 java files)
- Update/Sync (speculative) : 6 seconds (10 java files)
- Commit : 10 seconds (6 java files)
- Update/Sync (speculative) : 3 seconds
- Update/Sync (speculative) : 3 seconds
- Commit : 10 seconds (5 java files)
This is just showing regular life in the trunk, and not branch by abstraction commit by commit per se.
Moving to trunk based development on a nimble SCM
Flipping from ClearCase to Subversion or Perforce can be done in more than one way. Some clients take a phased approach, some do a big bang. One automotive client did the latter in a lunch break. Wherever it happens, you will hear reports of personal productivity increasing by 20% to 33%.
Be aware though that ClearCase requires admins. As many as one admin per twenty developers. They’ll want to put up a case for not switching SCM tools. I’m sure that Perforce and Subversion require admins. As Google apparently uses Perforce, I wonder how many of twenty thousand staff use Perforce’s ‘p4 admin’ on a daily basis. Not many I hope.
On Distributed?
Agile teams report productivity improvements with distributed SCM tools over Perforce or Subversion. There’s no doubt that’s true, at least today, but the Subversion team is pushing towards the features of their distributed competitors. At least, one feature at a time. I wonder though, if a team should not be adept with trunk-based development (BBA and “little & often”) before they move to distributed. That is a longer discussion perhaps.