Update: See the new resource site for Trunk-Based Development called, err, TrunkBasedDevelopment.com and make sure to tell your colleagues about it and this high-throughput branching model.

This is a writeup of the Microsoft Office team’s source control usage, and how close it is to an ideal Trunk-Based Development (TBD) model.

The summary, for those with ADHD, is that the Office team does TBD, but imperfectly. It’s not new for them; it’s the way they’ve done it for a decade.

We know of hundreds of software development teams doing the trunk model that are sized 10 to 100 committers, but only a few with more than 100 developers, and only a couple that have more than 1000. Until now. Microsoft’s Office team joins Google and Facebook as known ‘TBD at scale’ adherents.

How many committers to one trunk?

Microsoft have around eight thousand people in the Office related subdivision. Around three thousand of those are actual developers, but the other 5/8 of the Office organization could also commit if they wanted to - it does not matter whether they are program managers, project managers, Product Owners, execs (and similar) … they all can commit. Marketing people (and similar) are outside the Office team and can’t commit. The developers statistically eclipse all others, though, in numbers of commits to product code. Testers own UI automation and test code that gets run in the labs (see later), and on some teams they may check in to the test code much more frequently than developers check in to the product code.

There are often 1500 commits or so a day, but presumably that can range up or down depending on how busy the week is.

This has been the model for a decade or so now.

Where are these committers?

Short answer: worldwide. In fact developers are spread across seven timezones (UTC-8 to UTC+8) for major offices, and maybe more for remote workers and non-major dev centers.

Word, though, is pretty much wholly developed in Redmond. By contrast, SharePoint has more than 50% of its developers outside Redmond.

The technology in question?

SourceDepot (Microsoft’s licensed recompilation of Perforce) is the technology in question, not TFS. I cannot determine whether there is a plan to migrate to TFS at any point.

Developers ‘enlist’ in business-name subsets of the trunk. Projects by another name. That maps to the globbing include/exclude capability of client-specs in Perforce. I’d noted that Googlers subset their mega-trunk previously, and Microsoft are the same. To recap - Googlers subset down to “Adsense” or “Gmail” (or “Adsense AND Gmail” if that makes sense for one dev for one day). Microsoft, in this Office repository, do the same but the repo contains only Office related products (no Windows, no MS Paint etc).

Workstations are not big enough to check out everything in the trunk (same as Google). Therefore enlisting in your pertinent subsets is necessary. The constraints behind that are build times and physical hard drive space. The whole mechanics of enlisting in one or more projects is viewed very positively within Microsoft and has been around for as long as SourceDepot has been configured as the source-control system for Office.
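To make the include/exclude idea concrete, here is a minimal sketch of how an enlistment could decide which depot paths land on a workstation. It is not SourceDepot or Perforce syntax (Perforce client views use `...` as the recursive wildcard); it borrows shell-style globbing from Python's `fnmatch` as a stand-in. The depot layout, the `ENLISTMENT` list and the `is_enlisted` helper are all hypothetical names invented for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical enlistment rules. A leading "-" excludes matching paths,
# and later rules override earlier ones (an assumption of this sketch).
ENLISTMENT = [
    "//depot/office/word/*",
    "//depot/office/shared/*",
    "-//depot/office/shared/legacy/*",
]


def is_enlisted(depot_path, rules):
    """Return True if the last rule matching depot_path is an include."""
    included = False
    for rule in rules:
        exclude = rule.startswith("-")
        pattern = rule[1:] if exclude else rule
        # fnmatch's "*" also matches "/", so one pattern covers a whole subtree.
        if fnmatchcase(depot_path, pattern):
            included = not exclude
    return included
```

A developer enlisted this way would get Word and the shared code, minus the legacy subtree, and nothing from other products in the same trunk.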

Code Review

The Office team has a code review tool very similar to Google Mondrian, and definitely configured for the same Continuous Review style/rules. Specifically, things are reviewed before commit to the trunk, and there’s a hierarchy of approvers for that. It has per-line comments, per-file comments and per-commit comments - pretty much everything that Mondrian has/had.

Related to code review through a portal, there’s a way for developers to seek a change-list from a colleague and apply it to their own machine for review.

Binary Dependencies

If items are depended on from outside the source tree, they are checked in to the trunk as binaries. There’s a lockstep upgrade agenda for those too. If you are embarking on an upgrade of, say, unicode_baseline.dll (contrived) you’ll make a change-list as big as needed to make that upgrade work for everyone. Google were the same.

How strict is the adherence to TBD?

The Office group doesn’t follow TBD 100% of the time. One large divergence is that there is one trunk per major release of Office, and it is only within that release that TBD occurs. See here:

Not shown in the above diagram: anything real about dates, although you can take away that each trunk would (or should) last ten years, which matches the support periods. Also not shown: which release pertains to alpha, beta, Release Candidates, Customer Tech Preview, or Service Pack in a timeline.

For teams focussing on more than bug-fixes that affect more than one major release, bouncing between the Office 2013 trunk and current trunk is an ongoing headache. That headache is temporary, while Office re-tunes itself for the new release cycle, but it is still a day-to-day reality that moves them away from the smoothness of TBD somewhat.

Branch by Abstraction & Toggles.

While I have a bias towards Branch by Abstraction, which the Office team uses on occasion, more often than not “most pragmatic” guides migration from old to new components/code. Such changes are not going to be on a new branch by reflex, they’ll happen in the trunk by some pragmatic means, perhaps with a build switch to activate them conditionally.
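For readers who have not seen Branch by Abstraction combined with a build switch, here is a minimal sketch of the general technique. It is not taken from the Office codebase: the `SpellChecker` abstraction, both implementations and the `USE_NEW_SPELL_CHECKER` flag are hypothetical, and the word sets are placeholders. The point is that old and new code live side by side in the trunk, and callers only see the abstraction.

```python
from abc import ABC, abstractmethod


class SpellChecker(ABC):
    """The abstraction that callers depend on (hypothetical example)."""

    @abstractmethod
    def check(self, word: str) -> bool:
        """Return True if the word is considered correctly spelled."""


class LegacySpellChecker(SpellChecker):
    # Stand-in for the old component that is being migrated away from.
    def check(self, word: str) -> bool:
        return word in {"trunk", "branch"}


class NewSpellChecker(SpellChecker):
    # Stand-in for the replacement, built up in the trunk behind the switch.
    def check(self, word: str) -> bool:
        return word.lower() in {"trunk", "branch", "merge"}


def make_spell_checker(build_flags: dict) -> SpellChecker:
    """Pick an implementation from a build switch; the old one is the default."""
    if build_flags.get("USE_NEW_SPELL_CHECKER", False):
        return NewSpellChecker()
    return LegacySpellChecker()
```

Once the new implementation is trusted, the default flips, and later the legacy class and the switch are deleted in ordinary trunk commits.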

Not making long-running feature branches

There are occasional product or feature branches, though. They are rare - it is not a regular way of working for the larger team, and it obviously conflicts with the ideals of TBD. I don’t know any of the real cases behind these branches when they have happened.

Keeping the trunk green

The whole organization is in charge of such policing. There is also a dedicated build lab which is in charge of making sure individual checkpoints are high enough quality that when each development team syncs forward to the checkpoint their products will mostly work. Individual changes that break things are rolled back as part of their policing. It is not HEAD that arbitrary developers sync to a few times a day, it’s the latest passing checkpoint.
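The "sync to the latest passing checkpoint, not HEAD" rule is simple enough to sketch. This is only an illustration of the selection logic; I don't know how the build lab actually records checkpoints, so the `Checkpoint` record and the `latest_passing` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Checkpoint:
    build_number: int  # higher means more recent (assumption of this sketch)
    passed: bool       # True if the build lab's automation met the quality bar


def latest_passing(checkpoints: Sequence[Checkpoint]) -> Optional[Checkpoint]:
    """Return the most recent checkpoint that passed, or None if none did."""
    passing = [c for c in checkpoints if c.passed]
    return max(passing, key=lambda c: c.build_number, default=None)
```

A developer's sync would then target the build number returned here, even when newer but unverified commits already exist at HEAD.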

They have other concepts that allow them to have gray areas. For example, if one part of the code base has a low pass rate in the checkpoint automation, it may simply get locked to future check-ins without any changes being backed out, to allow the checkpoint to proceed. This process of locking is owned mostly by the individual product teams, not the build lab, with resulting ownership being a bit fuzzy.

Always Refactoring?

There’s no answer for this that is for all of ‘Office’. They have teams developing in Objective-C in Xcode, others in C# in Visual Studio, some in C++ in Emacs. Some of those tools don’t have refactoring capability. Only Visual Studio with built-in refactorings and JetBrains’ ReSharper tool have decent refactoring capability. At least, in Windows-land.

What having more than one trunk means for merging.

If a defect is fixed in, say, the 2013 trunk, and is felt appropriate to go out in a 2010 service pack too, it would be merged to the 2010 trunk, then out to the branch that’s for the next service pack. Other than the fact that it’s two merges, that’s identical to the way you’d do it for a classic trunk design - complete with cherry picks - even if that is for a list of commits. Here’s what that looks like, with the merges numbered as described above:

What Perforce/SourceDepot makes easy is mappings for appropriate merges: In the config grammar of ‘client specs’ //depot/office2010/trunk/** can be mapped to //depot/office2013/trunk/**. This would aid merging of cherry picked bug fixes from the latter (where everyone is more active) to the former for the sake of a bug fix or service pack release. It would also be used for the original creation of the new 2013 trunk.
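The effect of such a mapping can be sketched as a plain path translation. This is an illustration of the idea only, not the real spec grammar or any Perforce/SourceDepot API; the `map_path` helper and the file name in the usage note are hypothetical, while the two trunk roots are the ones mentioned above.

```python
def map_path(path: str, source_root: str, target_root: str) -> str:
    """Translate a depot path under source_root to the same relative path under target_root."""
    prefix = source_root.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path} is not under {source_root}")
    relative = path[len(prefix):]
    return target_root.rstrip("/") + "/" + relative


# The two trunk roots named in the text.
TRUNK_2013 = "//depot/office2013/trunk"
TRUNK_2010 = "//depot/office2010/trunk"
```

With that, a fix to a hypothetical `//depot/office2013/trunk/word/render.cpp` would be merged into `//depot/office2010/trunk/word/render.cpp`, which is what lets a cherry-picked change-list find its counterpart files in the older trunk.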

How many platforms do you target from one trunk?

It is unclear (to me at least) how many of the diverse platforms are targeted from one trunk. Certainly x86 and x64 are made from the same source, but is Office for WinRT (ARM) too? Perhaps yes - Building Office for Windows RT on blogs.office.com suggests a delivery goal from the outset was:

“ARM as a ‘first class’ platform, including the Same look and feel as x86/x64, same level of polish and reliability, full Office feature-set and fidelity and Service parity (e.g., save to SkyDrive, roaming settings, other Windows Live integration, Office.com experience, etc.)”

To me, that means the same codebase. Not sure about Mac and iOS apps though. Certainly there are recent questions about Microsoft’s directions for Office - though in that one it does not get technically nuanced, and we can’t really infer source-control decisions.

Continuous Integration Tooling

Microsoft have lots of tools that kick off against groups of commits into the trunk. Those include automated test suites. They have a concept of “checkpoints”, which are labels that represent a specific build number which has been through their build lab, verified to meet a minimum quality bar through automated testing. If that process finds defects, individual checkins that caused the defects are identified and backed out. Their development team is too large to do this on every commit, but the checkpoints are analogous.
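I don't know how the build lab pins a failure on an individual check-in, but one common way to do it for a batch of commits is bisection, sketched below. Everything here is an assumption for illustration: the `first_breaking_change` helper, the `passes_with` callback (standing in for "build and test the trunk with these changes applied"), and the premise that the build passes before the batch, fails after it, and stays broken once the culprit is in.

```python
from typing import Callable, Sequence


def first_breaking_change(
    changes: Sequence[int],
    passes_with: Callable[[Sequence[int]], bool],
) -> int:
    """Find the first change in an ordered batch that turns the build red.

    Assumes passes_with([]) is True, passes_with(changes) is False, and
    the build stays red for every prefix that contains the culprit.
    """
    lo, hi = 0, len(changes)  # prefix of length lo passes, prefix of length hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes_with(changes[:mid]):
            lo = mid
        else:
            hi = mid
    return changes[hi - 1]
```

For a batch of n check-ins this needs roughly log2(n) build-and-test runs rather than n, which is the kind of saving that makes per-checkpoint (rather than per-commit) verification workable for a very large team.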

There is a lot of leeway for individual teams to do unit testing how they like, and use as much of the common infrastructure as they like. As per best practice, developers run sets of tests before committing code, and the CI daemons rerun those automatically after the commit. The Office-wide common infrastructure does automated UI testing too, if that wasn’t clear before.

The infrastructure for all this tooling and automation is huge, but I’ve not been able to determine any numbers. I also don’t know the names/design of the technologies doing the automated pieces.

I’ll take a wild guess that the test pyramid is roughly followed, but with variance for teams.

Popularity of Trunk-Based Development

Developers in the Office super-team don’t discuss it much. There are aspects of their process that receive much more discussion, and since this aspect works well it is completely off developers’ radar. Therefore the popularity of TBD doesn’t get discussed too much.

At a day-to-day level, in line with other companies doing TBD, there’s a wish for local branches on a developer’s C: drive. Wishing to pull changes from a colleague’s un-pushed version is a separate thing, and has a much more serious impact on a TBD workflow. It does occasionally get discussed at an intellectual level, but nobody has made steps to create tooling for it.

Office team methodologies

Again, there’s as much variation within Office as there is within the software industry. Their wish for faster release cycles means a push towards adopting more agile techniques on the whole. It’s small-a agile, though, with no actual methodology the Agile industry would recognize being in force for teams. The need for pragmatism in a very large codebase drives the need to pull back from dogmatic methodology choices.

Things I don’t know, but wish I did

  1. The economics of their way - specifically has anyone ever looked at the cost-of-change aspects of their trunk usage? Are stats available?
  2. Has an order of release ever been changed without cost impact? Steve Ballmer is rumored to have delayed the release of Office for iPad in order to push the MS Surface versions first - did that have an in-team cost impact, or was it business as usual in the trunk?
  3. What the source-control landscape outside the Office team looks like
  4. Will the Office team stick with this design?
  5. What of the rest of Microsoft’s dev landscape?
  6. More details about SourceDepot vs P4 vs TFS


April 3rd, 2014