Paul Hammant's Blog: Smalltalk Envy
I myself was not on the famous C3 (Chrysler Comprehensive Compensation System) project in 1996-2000 that is pivotal to ‘Extreme Programming’ (XP). That project used Smalltalk, and team contribution was provided by Object Technology International’s ENVY version control system. I wanted to know more about that so I asked people that knew something about ENVY back then.
I was particularly interested in what it was like working with ENVY hour by hour, and I’m an amateur historian who likes to see things written up for posterity that need to be written up. Thus I asked the people I know about the early days of Extreme Programming, and ENVY usage, and have modestly reformatted conversations for posterity below.
Dave Thomas - founded the fabled OTI (Object Technology International)
Dave’s OTI was notable for Smalltalk, ENVY/Developer, Visual Age for Java, and (as it was absorbed into IBM) Eclipse. Dave himself is a restless and driven businessman and innovator. His impact on the field of software development is much bigger than you realize, unless you already knew everything on his personal website. Dave’s fact-packed contribution to this posting:
The original system was called Orwell and was commissioned by Brian Barry of DREO Canada in order to bring disciplined development to a team of developers. In those days version and configuration management were foreign to the personal, image-centric development style of Smalltalk. It managed both source and binary as well as versions of methods, classes, applications and their exact configurations for different products/releases. Developers managed their own change logs and code could only be shared by exporting and importing source. The original research system used a B-tree for method storage. The first system, designed for developing embedded systems with code placed in ROM, had an overly strict concept of code ownership.
Method and class level versioning were new concepts at the time, as were applications (now called packages and components), and there is still no capability in existing systems for sub-applications, method extensions and configurations which are first class in the programming environment. ENVY/Developer was an internal tool that escaped because there was a need for something like it. Unfortunately the UI and documentation often meant that unless you learned it from a master (Alan and the ENVY/Master book) many didn’t grok the workflow. ENVY was a great, strong, specific solution, which meant it worked well if you used it as intended. It was constrained to run on very small machines and poor network file systems where storage was expensive. Due to a rush in implementation we didn’t share the global name space; sharing it would have allowed ENVY to support true distributed development including global definition and initialization.
The commercial implementation done at OTI was called ENVY/Developer. Since I hate acronyms, we named products and projects with names. ENVY was inspired for software environment. ENVY/Developer was the version and configuration management tool, ENVY/Smalltalk was the embedded Smalltalk, ENVY/Actra the Smalltalk with Actor, ENVY/Packager the tool that removed unnecessary code from the image (both methods and class slots) so the code could be placed into RAM/ROM/Flash as appropriate. Later ENVY/App provided an “applet”-like package for the Momenta pen computer.
The commercial implementation uses an immutable log for changes, and semantic locks to implement efficient fine grained concurrency. ENVY/Developer added the concepts of Configurations (Config Maps) and Sub-Applications. The former allowed a complete image to be loaded quickly for a specific product/Application configuration. Config maps could be nested for complex product development. The ability to click and load a specific configuration enabled one to support a specific customer in the field.
Code ownership was supported for both classes and applications, with a mechanism which allowed others to assume ownership when needed. In practice many teams used a common owner for applications. Applications were one of the first uses of Packages. An Application when loaded had the option of initializing itself and other applications. Applications were intended to be used as components with value based interface methods. SubApplications and Class extensions enable multiplatform or specialized method variants to be loaded for execution in different applications/platforms. One could version a method, a class, a SubApplication or Applications. Clients used the product to manage documents, hardware tests and design diagrams.
Visual Age Java was implemented in Smalltalk and the Team facility used ENVY/Developer under the hood. The move to Java in Eclipse removed ENVY/Developer; but the ideas were contributed to Delta V (the basis for WebDAV/Subversion). ENVY/App provided the facility to load and unload applets, which at the time was not available for Java. Hence OTI contributed to the core of OSGi.
Many of the ideas were similar to Git but with an integrate and share first culture versus a fork and merge later culture. ENVY encouraged fine grain sharing to reduce merging. Being able to have multiple developers working on classes and methods with few conflicts facilitated this culture. Later versions included diff and merge tools. Recently Analyst for Kx used a new implementation of the ENVY repository, which now uses a Git backend. The Git model is better for distributed development, and ENVY adds the concepts of function and module level versioning and configurations to the Git model.
Ron Jeffries - C3 veteran and industry luminary
Ward suggests that Smalltalk’s way of operating encourages one to think of oneself as operating on a private branch. I’ve found that in my own early Smalltalk work, which was solo or with maybe one other person, we felt great resistance to synchronizing with other versions, but those were more like releases from the Smalltalk provider or some library provider. We were, in essence, working on a very long-lived branch, perhaps nearly a forever-branch.
C3 didn’t work like that at all. On the contrary, we developed a style where no branch lasted more than a day, and usually only half a day. This came about, I think, due to our direct observation of something Kent Beck taught us, which was that if something hurts, we should do it more often.
If you wait a week or a month to integrate outside changes, it’s quite difficult, tedious, slow, and error-prone. When you integrate after a few hours, it goes much more easily.
Part of what one does is like today’s “pull request”, where someone else has completed something and asks that everyone take it into their image. Since we worked in a room, this was done by a pair calling out that they had updated something and would everyone grab it. Each other pair would wait until their tests were green — usually, only a matter of minutes, when you’re proceeding in very small steps — then pull in the new release of code and tests. They’d check for any collisions: the Smalltalk browsers would display any differences. Then they’d run all the tests, a matter of minutes, and make sure everything was still fine. In the rare instance where it wasn’t, they’d work with the releasing pair to find and fix the problem.
When you reason about this kind of approach, frequent integration becomes obviously the right thing to do. We were working with around 3,000 production classes, and 30,000 methods. If you and I work separately for an hour, the odds are that we’ve not touched any of the same code, and since we’re both keeping all our tests running all the time, I can pull in your code with near-certain impunity.
On the contrary, if you and I work separately for days on the same application, we’ll almost certainly interfere with code the other has touched, and our tests, while running green on both branches, will be likely to deviate. Then, when we try to integrate, we have a near certainty of a collision and a difficult session of integration and debugging.
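The odds Ron describes can be roughed out with a little arithmetic. The sketch below is only a back-of-the-envelope model: it assumes each pair touches methods uniformly at random among the 30,000, and the touch counts (10 methods in an hour, 300 over several days) are hypothetical numbers chosen for illustration, not figures from C3.

```python
from math import comb


def collision_probability(total_methods: int, mine: int, yours: int) -> float:
    """Chance that two independently chosen random sets of methods overlap."""
    # P(no overlap) = C(total - mine, yours) / C(total, yours)
    no_overlap = comb(total_methods - mine, yours) / comb(total_methods, yours)
    return 1.0 - no_overlap


# Hypothetical touch counts: 10 methods each after an hour, 300 each after days.
hour = collision_probability(30_000, 10, 10)
days = collision_probability(30_000, 300, 300)
print(f"after an hour: {hour:.1%}, after several days: {days:.1%}")
```

Under these assumptions the hourly figure comes out well under 1% and the multi-day figure above 90%. Real edits cluster around the same features, so actual collision rates would be higher in both cases, but the gap between the two is the point.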
I note that you refer to the need for rigor. We didn’t have that experience at all. We were working on user stories that were sliced down to require a day or two of work. We cycled between writing a test and making all the tests green again in a few minutes. (We often felt that if something took more than 20 minutes, we should just roll back and start over, and we often did just that.) When we changed anything interesting, we’d get people to pull it in. When we had a story working, which most pairs did once or twice a day, we would integrate it into the trunk image, and ring our little bell. This was the signal that there was good stuff out there. (A goal-line victory dance was optional.) People would integrate that code by saving their work, loading the new image, and checking in their work.
It didn’t seem to require discipline or rigor at all. When you kept a branch open a long time, the pain of re-integrating was usually enough to teach you never to do that again. If that didn’t do, a dozen other people reminding you sufficed.
This experience was enough to teach me, and I think everyone on the team, that very short-lived branches and merging back to trunk is so obviously the way to go that we’d try hard to resist any other approach.
Ward Cunningham - OO pioneer and industry luminary
Ward Cunningham used Envy for “The WyCash Portfolio Management System” earlier in the 90’s and is an Agile Manifesto signatory. He talked about the project at OOPSLA in ‘92. This was before the C3 project kicked off, and Ward himself was not on the project, but his commentary is key because by many accounts he is the grand pappy of modern programming. He says:
We faced this problem when we advocated for collective ownership, a natural consequence of pair-programming, but considered irresponsible at the time.
Refer to the Portland Pattern Repository’s IntegrationMachine entry.
Smalltalk kept all source in a machine image and encouraged developers to read and refactor any part of it. The image could be seen as a personal branch on a personal computer but our work required a collective approach suitable to common goals. Edits were kept in a change log which could be “filed out” of one branch and into another. This included capturing operations like renaming or removing classes and methods. I wrote a tool called the Change Sorter that allowed selective file out for merge. It has been cloned many times and workflows built around it.
Refer to the Smallest Federated Wiki (SFW) entries on episodes-pattern-language, work-product, work-integration, developmental-build, and build-repository. (use arrow keys to navigate)
Note that we were methodical enough to read any database variation back to our own alpha releases, progressing through multiple code and data reorganizations. This let us distribute a field patch to a customer that would update their database and then have that smoothly integrate into our next release where DB migrations would sort themselves into the proper order.
OTI and Connextra (London)
Steve Freeman, Rachel Davies, Tim Mackinnon, Ivan Moore and Duncan Pierce were pioneers of Extreme Programming in London in the late 90’s, and friends/former colleagues of mine. Between them, they were at two companies pertinent to this article: Object Technology International (OTI), and Connextra.
OTI made ENVY of course, but also the Smalltalk environment that used it, then folded into IBM to make Visual Age for Java (still written in Smalltalk) before Eclipse was started (leveraging all of that experience building IDEs).
Connextra was a startup making software for the nascent advertising industry. It was famous for doing XP faithfully with the whole company fully backing the methodology. The nature of the development was a streamlined consecutive development of consecutive releases - in other words, nothing from iteration N+1 started until iteration N was completed and deployed, with the cut over being a single day as far as the developers were concerned.
OTI’s Envy, like many things in Smalltalk, has not been matched by more recent tools. Every method save would create a new version of that method which was immediately available to everyone on the team. Tags were a slice through class and method versions. It had a lot of influence on the way the Smalltalk community viewed team working which, of course, fed into XP via C3.
I think there’s a level below Envy that we should mention, which is the Smalltalk (via Lisp?) tradition of image-based development where there’s always a live environment. Envy added a sense of a live repository of code which, if you squinted, extended the concept of liveness to the team.
This made later XP practices, such as shared ownership of code, easy to implement, although at the time we had personal ownership of features which avoided merge clashes.
The comparison with XP is interesting. For example, there was a clear understanding of going slower to go faster. Introducing late changes to release took a lot of agreement because we didn’t want to break tested code–this was unusual at the time. On the other hand, our group did not have anything like CI, so occasional solo refactorings would break the world.
I never used Smalltalk with ENVY, I came straight to VisualAge for Java, which was written in Smalltalk and used ENVY underneath, while at Connextra. I had previously been working in C/C++ with RCS and ClearCase so no basis for comparison.
At Connextra we used a physical Integration Machine that pairs used to run all JUnit tests and then commit changes to the main branch; a pair would typically do this 5-6 times during the day.
We had a toy cow that we mooed on top of this to alert other pairs to pull in latest changes. I still have a few photos of these things.
Stories broken into tasks - their size was between 30 mins to 2 hours work for a dev-pair to take from “ready” to “done”. An average of one hour perhaps. This is about the same as we do nowadays, as it happens.
At Connextra we ran Python smoke tests on deploy and also did some manual checks in the live environment to see things looked alright. I think we had a few FiT tests but not many. We had automated Nagios checks with a quiet background heartbeat sound for health and vocal alerts for servers that were not responding.
We used VA-Java soon after I heard Ward talk about XP at OOPSLA (and before the white book), and as I had used Envy and Smalltalk at OTI, we were able to emulate those similar practices in our own style (see below) at Connextra.
To add to what Rachel has mentioned - VA Smalltalk actually used Envy underneath - it was its own Unix-based file structure (so not CVS) - it had a small process that ran on a server to manage record locking, which was actually conceptually simple as it worked a bit like BigQuery does - it was an append only record system (so no deletes) and it worked at the method granularity which greatly reduced merge conflicts. There was a concept of config maps that pointed to all the latest “versions” of things (i.e. a change set)
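To make the shape of that idea concrete, here is a toy model of an append-only store working at method granularity, with a config map pointing at chosen versions. It is a sketch only: the class names, fields and the dictionary used as a config map are all hypothetical, and nothing here reflects ENVY's real file format or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MethodVersion:
    class_name: str
    selector: str
    source: str
    author: str


@dataclass
class AppendOnlyRepository:
    """Toy store: records are only ever appended, never changed or deleted."""
    _log: list = field(default_factory=list)

    def save(self, version: MethodVersion) -> int:
        """Append a new method version and return its record id (log position)."""
        self._log.append(version)
        return len(self._log) - 1

    def load(self, record_id: int) -> MethodVersion:
        return self._log[record_id]

    def history(self, class_name: str, selector: str) -> list:
        """Record ids of every version saved for one method, oldest first."""
        return [i for i, v in enumerate(self._log)
                if (v.class_name, v.selector) == (class_name, selector)]


repo = AppendOnlyRepository()
first = repo.save(MethodVersion("Account", "balance", "^balance", "tim"))
second = repo.save(MethodVersion("Account", "balance", "^balance ifNil: [0]", "rachel"))

# A hypothetical config map: for each method, which record id is current.
config_map = {("Account", "balance"): second}
current = repo.load(config_map[("Account", "balance")])
```

Because two people saving different methods append different records, they never contend for the same "file", which is one way to read the claim that method granularity reduced merge conflicts.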
VA-Java was very much like using Smalltalk and Envy. Conceptually an open “Config Map” would be like a git master, pairs would check out that map and could version “local” changes eventually pushing them to master (typically hourly/daily). There was a way to run a formal branch in that model - but it was a bit awkward and we tended to avoid it. Things are a bit more automated in today’s world. As there was no auto-merge facility in Envy - just good manual diff tools, a physical integration machine was used. It sounds primitive - but actually was surprisingly effective.
On one hand, method level versioning (and class and package versioning) and the need to collaborate to avoid large merge conflicts actually meant that they were quite rare. This model also encouraged small changes and frequent “pulls” from master to avoid pain. Also, the social aspect of getting up to do a release (and seeing others do it) had an equally powerful effect of encouraging productivity and conversation. This is something that struck an interesting chord when we talked about it at an XPday several years ago. All this said - pre-Connextra when working at OTI in Envy-Smalltalk, lots of offices worked remotely and as such there wasn’t an integration machine - and it was generally a role of a person to put together releases. I don’t recall it being that painful back then using that approach either. However I fondly remember the integration machine from Connextra with that moo’ing cow.
Our stories were in the half a day to 2 days size - although I think we were still slightly in the task vs. Story mindset originally proposed in XP (so some of our stories were a bit task’y in retrospect - although I don’t think that was particularly problematic in a team that was good at collaborating). I feel like the way people work now on stories was much like how we worked back then - possibly we discussed the details a bit more (not sure whether that was better or not).
Our testing was more at the unit level - although we did have higher level smoke tests and we also parsed error logs etc.
I wrote a tool that took the versioned classes in your private branch (actually - it wasn’t a branch per se, it was just what was in your image) and created a changeset - that changeset file we used on the integration machine to bring the relevant class/package versions into the official master branch where we then did a final commit at the config map level (equivalent to a label in git is my best approximation). Envy did have some tools for doing this - but we found them a bit cumbersome and heavyweight - and this lightweight change set mechanism seemed to work for us.
There used to be a great video that showed how Envy developer worked - I did a quick search online but I can’t find it. I wonder if that might help you understand what it was like (maybe someone has a copy somewhere?)
As Steve mentioned to you - every method was automatically versioned when you saved it (so you never lost work - this was a bit revolutionary in its time, that you could even contemplate doing that. Even today I think people could use infinite undo). You could then version classes (like we do now), you could also version Packages (like we do now) - however, you could also specify what dependencies a package had and the compatible versions for those dependencies (e.g. Apache 2.0 and greater). There was a really neat sub-aspect of this in that dependencies could be hardware dependent (so you could pull in different code if you were on Windows vs. Linux using the same v2.0 package). Finally, there were config maps which assembled packages together to form complete solutions (or applications). There were also pre-post hooks on load for all of these so you could do clever stuff… In a way it was a precursor to build pipelines, as the OTI guys did some clever things with those hooks.
There was a real engineering discipline to the whole thing that the video I’m thinking of really teased out in 199x - I think it possibly gives some of the insight into the thinking and rigour of XP (as I bet Kent saw that video)
ENVY was a big atomic write only database (using appends). To release a change to multiple classes you were effectively storing the active versions of those classes in a parent (in Java, that is a package) - in the UI you would select those classes and press release (which in code wrapped this in a transaction and did an atomic write in the package). So - I think you may have been thinking about something else Ivan.
For #2, I think you are correct - although I think the flaw was less in Envy, but more in the versioning UI tools - as the data would all be there in Envy - the current and previous versions plus your version. I think it wasn’t such a big deal, and so no-one ever did anything about it.
What wasn’t available was automatic merging - you used the diff tools and stepped through your changes - accepting left and right as appropriate. It sounds tedious, however, I think many an error or improvement was discovered when doing that (vs. a magic auto-merge). I think the discipline of reviewing those changes was very useful - however this is where I think Ivan is thinking about atomicity - as while doing that review (normally with changes that took a few minutes) there was a danger that someone could release underneath you - meaning you ended up creating what was called a “scratch version” - a version that wasn’t released into its parent (as the parent now had a new version). This is why we serialised things at a release machine. It also had a nice side effect that you were aware of what changes were going into master - and often that realisation sent you back to your desk to rework something to make it fit better.
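The "release underneath you" race is easier to see in miniature. The following is a toy model under stated assumptions: the `Parent` class, its `release` method and the version labels are invented for illustration and are not ENVY's actual mechanism or terminology beyond the phrase "scratch version".

```python
class Parent:
    """Toy parent (for example a package) tracking which class version is released."""

    def __init__(self, released: str):
        self.released = released
        self.scratch = []  # versions that exist but were never released here

    def release(self, based_on: str, new_version: str) -> bool:
        """Release new_version only if the parent has not moved since based_on."""
        if self.released != based_on:
            # Someone released while we were reviewing: our version is left
            # behind as a scratch version and needs another merge pass.
            self.scratch.append(new_version)
            return False
        self.released = new_version
        return True


parent = Parent(released="v1")
# Both pairs started from v1; pair B releases while pair A is still reviewing diffs.
b_ok = parent.release(based_on="v1", new_version="v2-from-pair-B")
a_ok = parent.release(based_on="v1", new_version="v2-from-pair-A")
```

Funnelling every release through one physical machine means only one pair is ever between "review" and "release" at a time, so the second call above simply cannot interleave with the first.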
I think teams today use a red/green build screen as that token - with master broken more often than most people admit.
The VisualAge for Java experience with its integrated ENVY was very similar to the Smalltalk experience with ENVY. We lived the no branching ideals of Continuous Integration.
I just can’t remember what the average story size for a development pair was. I’m pretty sure that Tim is right, that we split stories up into technical tasks, each of which would be quite small - I think anything from a couple of hours to a couple of days, probably with an average of something like half a day. I don’t remember how large stories (rather than tasks) were. I think Rachel’s “between 30min to 2 hours work” could be right for some of the technical tasks but I don’t remember well. I’d be impressed with our previous selves if many were that small - I think they were probably a bit larger - but nevertheless our tasks were smaller than most teams’ stories that I see these days.
We had lots of fast running JUnit tests. We had very few non-unit tests. Overall the balance was much more heavily unit test than most teams I see these days. It worked OK for us. We had a very quick build but did occasionally (not often) miss problems.
Tim produced a very innovative system for doing visual testing by starting up VMWare (or similar) images with different OS/browser combinations, then saving a screenshot on a shared file system, so you could see how our stuff looked on different platforms and whether it changed over time. Not automated image comparison (not feasible back then) but automated capture for later human comparison.
As far as I remember there were at least two limitations of ENVY that were partially relevant to the benefits of having a release/integration machine.
- ENVY didn’t have the equivalent of atomic commits to multiple files.
- ENVY didn’t do three-way comparisons - only two way. So you couldn’t tell the difference between you adding something vs someone else having deleted something (or vice-versa).
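The second limitation is worth a small illustration. The sketch below compares sets of method names with and without a common ancestor; the function names and the example selectors are hypothetical, and this is a generic picture of two-way versus three-way comparison rather than a description of ENVY's diff tools.

```python
def two_way(mine: set, theirs: set) -> dict:
    """Without a common ancestor we only learn which side has each method."""
    return {name: "only in mine" if name in mine else "only in theirs"
            for name in mine ^ theirs}


def three_way(base: set, mine: set, theirs: set) -> dict:
    """With the common ancestor we can tell additions from deletions."""
    result = {}
    for name in mine ^ theirs:
        if name in mine:
            result[name] = "deleted by them" if name in base else "added by me"
        else:
            result[name] = "deleted by me" if name in base else "added by them"
    return result


base = {"deposit:", "withdraw:"}
mine = {"deposit:", "withdraw:", "audit"}   # I added audit
theirs = {"deposit:"}                        # they deleted withdraw:

ambiguous = two_way(mine, theirs)
resolved = three_way(base, mine, theirs)
```

In the two-way result both `withdraw:` and `audit` look identical ("only in mine"), so a human has to know the history to merge correctly; the three-way result separates the deletion from the addition.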
We did split stories (white cards, running left-to-right across the board) into tasks (blue cards, top-to-bottom, under the corresponding story), and generally focused on completing tasks. We had a strong focus on completing things, and, in my memory at least, were better at completing everything required for a story without creating loose ends than probably any other team I’ve worked with subsequently. The same was true for integrating. I haven’t worked anywhere else where there was a “release cow” used to signal the need for all pairs to catch up at their earliest convenience. Consequently, although we took on some fairly large stories in our three-week iterations, I think we were probably capable of releasing immediately almost all the time. I recall that we paid attention to maintaining a working baseline at all times, and staged our tasks so that the baseline was (almost) never broken. It was incredibly rare to catch up with changes from version control and find yourself broken.
We did not use branches, although I can’t remember how we recorded the results of spikes where code was produced - particularly code written for Gold Cards. It’s possible we had some little 1 day branches to hold code of that sort. I don’t remember ENVY being suitable for creation of branches, and I don’t think we were particularly tempted to use branches, as they were rather antithetical to our style of continuous integration.
Tasks were generally small (certainly smaller than other teams’ stories) and a pair would get through several a day. This was really our unit of work, and typically we committed changes at roughly task level. Stories varied widely in size and I can remember larger ones having almost a whole board’s height of blue task cards (~10-12?) under them, although 1-4 is more typical in my memory. Extrapolating, I would guess a large story would take around 3-5 pair days, a small one around a half day, and a typical one 1-2 days. We spent a long time (too long!) on planning because we invested quite a bit of time in the task breakdown before estimating. Looking back and comparing to my experience on other teams, I think we benefited from this during planning (better estimates), design (involving whole team in given story and expressing design decisions through tasks) and execution (less loose ends missed -> less “bugs” / “follow up” stories -> less churn). I’m unsure if the time invested was worth it, though. I still try to follow the small tasks, lots of commits, catch up often, never break the baseline philosophy and I find it straightforward to superimpose that on the modern conception of a story being for the pair to work on autonomously. At Connextra, personal/pair process was unusually well aligned with team process. I haven’t worked anywhere else where process was so homogeneous across the team. I recently started working with Paolo again after 15 years and was surprised to find we still both work in almost exactly the same way despite all the things we’ve learned in the meantime.
I believe the subsequent trend to cut story sizes to fit smaller iterations (and thus stories tend towards what we used tasks for) promotes loose ends (i.e. overall business-level story doesn’t hang together even though dev team thinks they’re done) and sometimes artificially small stories that don’t really have any value by themselves (i.e. what we called tasks). I cannot remember how good our stories really were, though, and I can remember some instances of trying to slice them too fine.
Testing was (during my time) almost exclusively unit testing with very high coverage, with logic tested as exhaustively as we could manage. Since Connextra I have tended to follow a lighter testing style, but still strongly prefer unit tests over integration tests, and I tend to prefer technologies that support unit testing, or at least don’t force outside-the-box testing. FIT was after my time. We did do some testing of our generated HTML pages using JTidy feeding an XML parser and then asserting on the resulting “DOM” which was probably as close as we could get to the Selenium/WebDriver style of testing back then. Python was mostly after my time, and was mainly Ivan’s fault, if I recall correctly. ;-)
And yes, Tim’s VMWare automation was very cool and well ahead of its time.
In respect of pair rotation - I don’t remember it being particularly unusual for us to rotate a pair on an already-started story, and we did occasionally rotate a second time. Our policy was to maximise information diffusion, so on day 3, neither of the original devs would be on the story. Two rotations was pretty rare, though. I think that was generally for things that were technically nasty (e.g. Paolo vs Apartment Threads) rather than business requirements that were intrinsically “big”.
We didn’t have a build server - that was the release/integration machine. We used it partly as a token to prevent commits to trunk being made in parallel. This was partly to compensate for limitations in ENVY. There was a process checklist on the wall above the machine. You would bring your changes to the machine in the form of (IIRC) a list of RCS-like versions of individual methods, then review and merge them with latest. Then run all tests, then publish, then turn over the release cow to signal others to catch up. The release machine acted a bit like a modern pre-commit hook that ran the tests before allowing the commit.
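The checklist described above (merge with latest, run all tests, publish, signal) can be sketched as a single serialised procedure. This is an illustrative model only: the `ReleaseMachine` class, its `integrate` method and the stand-in test function are hypothetical, and a `threading.Lock` plays the part of the one physical machine acting as a token.

```python
import threading


class ReleaseMachine:
    """Toy model of a single integration machine used as a commit token."""

    def __init__(self, run_all_tests):
        self._token = threading.Lock()       # only one pair at the machine
        self._run_all_tests = run_all_tests  # callable: list of changes -> bool
        self.trunk = []
        self.cow_signals = 0

    def integrate(self, changes: list) -> bool:
        with self._token:
            candidate = self.trunk + changes   # review and merge with latest
            if not self._run_all_tests(candidate):
                return False                   # tests red: nothing is published
            self.trunk = candidate             # publish
            self.cow_signals += 1              # turn over the release cow
            return True


# Stand-in test suite: anything labelled "broken" fails.
machine = ReleaseMachine(run_all_tests=lambda code: "broken" not in code)
ok = machine.integrate(["task-1"])
bad = machine.integrate(["broken"])
```

The ordering is the design choice being illustrated: tests run before publishing, so trunk only ever moves from one green state to the next.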
I’ve never really been keen on build servers as ways to check commits, and I particularly dislike hearing them referred to as “continuous integration” servers. CI is a human process to promote collaboration and team working. So I gently take issue with “you are going to need to hook up a build server to verify their commits” - if they have committed, it’s too late! Build servers do have value (IMO) as creators of golden build artifacts, though. I don’t object to them - only to people thinking / promoting setting up a build bot means their team is now doing CI.
One small point is that integrating often does not by itself imply that the code is releasable on demand unless your unit of commit is a story (and I believe it shouldn’t always be), because if finished tasks that do not add up to a complete story are present in the code, it may be visible in the released-on-demand software, which may be undesirable for the business. We tackled this at Connextra by trying to complete all the “enabling” tasks first, then doing the final task that wires it all together in a visible form. We didn’t use feature flags. I guess you could say we tried to make committing the final task the “feature flag”. Probably not always as successfully as I now remember. I used to call this style “back-to-front” (i.e. back end first, front end last).
On ENVY specifically
I inferred that the rate-limiting aspect of the Integration Machine was because of a limitation in ENVY, from the way we used it. The others’ knowledge of ENVY is greater than mine, and I think the Release Machine was already in place when I joined so it might be that I haven’t fully understood. My memory is that ENVY had versions of fine-grained elements of the source (like Tim wrote) and you could have a “current version” of the whole source composed of the current versions of all those bits. But you could go back as well as forwards with your versions of a particular bit (kind of like StarTeam but with redeeming features). This gave the possibility of inadvertently undoing someone else’s changes in a merge. Serialising through the Release Machine made that very unlikely. To be fair, it was quite easy to recover from, but perhaps not as easy to notice.
I don’t think we actually intended rate limiting, but we did occasionally get a small queue at the Release Machine. That may have been one of the reasons for the introduction of the Release Cow as the team grew. (I think there were only 3 devs when I joined).
ENVY did make it possible to “go to the token” to commit rather than taking the token to the pair station since the micro-versions of the bits were everywhere. I haven’t seen this model in use anywhere else.