Paul Hammant's Blog: TCKs and Servirtium
I’ve been building an open source service virtualization technology over a couple of years called “Servirtium”. It is out on GitHub now and utilized in a FitTech startup I’m involved with.
Here’s a diagram (pls excuse the inline SVG):
Here a dev team has a unit test (say JUnit for Java) that calls something over the wire for some purpose. The team notes that it is slow and not at all consistent. They suspect that the “sandbox” service they are POST/GETing to isn’t really like the production version of the same service, and they fear they’re sharing the same sandbox env with many other companies, potentially affecting the underlying data and certainly the performance. The icing on the cake is that they suspect the vendor/other team is also doing their own dev/integration/QA testing in there, and that also affects the experience. For many decades we’ve just had to put up with that.
There’s a field of computer science called Service Virtualization that has made this easier in the last few years. There are commercial and open source versions of this. I have a rule though. If you’re employing Service Virtualization for test automation: no TCP/IP leaves the machine during tests. Mountebank by Brandon Byars of ThoughtWorks and the more established WireMock adhere to that rule.
I’ve made one too though - I wanted the recorded conversation (it records) to be in Markdown, and formatted in such a way that it looks pretty on GitHub, given that GitHub automatically transforms Markdown into HTML in the web interface. More later on that.
Let’s consider test recording:
Here, as a man in the middle, Servirtium is recording the HTTP conversation for the duration of the test and storing that in markdown. You would co-locate this with the tests in source-control and check it in. Perhaps after some sanitization.
Then in another mode of operation, playback:
This time the remote service is not used at all - just the played back conversation/interactions from markdown in source control.
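To make record and playback concrete, here is a toy in-memory sketch in Java. It is not Servirtium's API: the class names are hypothetical, requests and responses are plain strings rather than HTTP, nothing is written to Markdown, and the strict in-order matching during playback is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy illustration only: hypothetical names, strings instead of real HTTP traffic.
final class ConversationSketch {

    /** One request/response pair from the recorded conversation. */
    static final class Interaction {
        final String request;
        final String response;

        Interaction(String request, String response) {
            this.request = request;
            this.response = response;
        }
    }

    /** Record mode: forward each request to the real service and remember what came back. */
    static final class Recorder implements Function<String, String> {
        private final Function<String, String> realService;
        final List<Interaction> recorded = new ArrayList<>();

        Recorder(Function<String, String> realService) {
            this.realService = realService;
        }

        @Override
        public String apply(String request) {
            String response = realService.apply(request);
            recorded.add(new Interaction(request, response));
            return response;
        }
    }

    /** Playback mode: never call the real service; replay the recording in order. */
    static final class Player implements Function<String, String> {
        private final List<Interaction> recording;
        private int next = 0;

        Player(List<Interaction> recording) {
            this.recording = recording;
        }

        @Override
        public String apply(String request) {
            if (next >= recording.size()) {
                throw new IllegalStateException("No more recorded interactions for: " + request);
            }
            Interaction expected = recording.get(next++);
            // Assumption: a request that differs from the recording is a test failure.
            if (!expected.request.equals(request)) {
                throw new IllegalStateException(
                        "Expected '" + expected.request + "' but got '" + request + "'");
            }
            return expected.response;
        }
    }

    public static void main(String[] args) {
        // The lambda stands in for the remote sandbox service.
        Recorder recorder = new Recorder(req -> "response to " + req);
        recorder.apply("GET /things");

        Player player = new Player(recorder.recorded);
        System.out.println(player.apply("GET /things"));
    }
}
```

The point of the shape is that the code under test sees the same interface in both modes; only the thing behind it changes.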
Your test suite would need three modes of operation:
- direct
- recording mode
- playback mode
You’d need a switch in the test code to be able to run a suite of tests in one of these three modes. Your aim is that all three alternates pass. Maven profiles or -D properties are a crude way of passing that intent into a build.
Your main build, run automatically on checkin, would run against the played back markdown. The Technology Compatibility Kit stuff that I promote from time to time would require a less frequent attempt to record again and compare the differences between the current recording and the previous one. For the startup, we’re re-recording every Monday AM as there are some dates in the vendor’s sandbox service that appear to move forward on a weekly basis. We’ve a different build-automation job for this. We could make that build job autocommit the changes to dates, but we’re doing a human verification of the diffs right now.
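A minimal sketch of such a switch, in Java, might look like the following. The property name `servirtium.mode`, the class name, the default of playback, and the URLs are all assumptions made up for this example; none of them come from Servirtium itself.

```java
import java.util.Locale;

// Hypothetical mode switch; "servirtium.mode" is an invented property name.
final class TestModeSwitch {

    /** The three ways the same test suite can be run. */
    enum Mode { DIRECT, RECORD, PLAYBACK }

    /** Parses a mode name, defaulting to PLAYBACK when nothing is supplied. */
    static Mode parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Mode.PLAYBACK;
        }
        return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    /** Reads the mode from a -D system property, e.g. -Dservirtium.mode=record */
    static Mode fromSystemProperty() {
        return parse(System.getProperty("servirtium.mode"));
    }

    /** Picks the base URL the code under test should talk to. */
    static String baseUrl(Mode mode, String realServiceUrl, String localProxyUrl) {
        // Direct mode talks to the real service; record and playback both go via the local proxy.
        return mode == Mode.DIRECT ? realServiceUrl : localProxyUrl;
    }

    public static void main(String[] args) {
        Mode mode = fromSystemProperty();
        // Both URLs are placeholders for illustration.
        System.out.println(mode + " -> "
                + baseUrl(mode, "https://sandbox.example.com", "http://localhost:61417"));
    }
}
```

Defaulting to playback keeps the everyday build off the network unless someone explicitly asks for direct or record mode.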
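The compare step of that re-recording job could be sketched like this in Java. It is a naive position-by-position comparison rather than a real diff algorithm, and the class name, endpoint, and dates in the sample data are invented; the sample lines are not Servirtium's actual Markdown format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: flags lines that changed between two recordings.
final class RecordingDrift {

    /** Returns a description of each line position where the two recordings differ. */
    static List<String> changedLines(List<String> previous, List<String> current) {
        List<String> changes = new ArrayList<>();
        int max = Math.max(previous.size(), current.size());
        for (int i = 0; i < max; i++) {
            // Treat a missing line in the shorter recording as a difference too.
            String before = i < previous.size() ? previous.get(i) : "<absent>";
            String after = i < current.size() ? current.get(i) : "<absent>";
            if (!before.equals(after)) {
                changes.add("line " + (i + 1) + ": '" + before + "' -> '" + after + "'");
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Invented sample data standing in for last week's and this week's recordings.
        List<String> lastWeek = Arrays.asList("GET /schedule", "{\"date\": \"2019-09-02\"}");
        List<String> thisWeek = Arrays.asList("GET /schedule", "{\"date\": \"2019-09-09\"}");
        changedLines(lastWeek, thisWeek).forEach(System.out::println);
    }
}
```

A job like this would only report the changes; deciding whether a moved date is harmless or a sign of real incompatibility stays with the human reading the diff.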
That checking of diffs is the key tie-back to the Technology Compatibility Kit (TCK) elevation of the Service Virtualization (SV) idea.
People think CI is build automation by a daemon, but it’s not. It is a human practice about integrating all the work that’s happening in parallel and testing that it’s good. That does not need to be done with a daemon service at all. That aside, you’ll note that the use of playback in a TCK idea is the opposite of integration as an activity. Indeed it feels like delayed integration. If you could avoid doing it you would, but that other team is outside your control and this feels like a decent buffer against their slowness and unreliability. A buffer with a safety net that allows you to move fast but not get out of step long term.
A lot of what we have been doing since Kent Beck et al got busy defining eXtreme Programming 20 years ago is shift left. Barry Boehm did the statistical science (also 20 years ago) on “quality” and started the visuals around the leftwards shift needed (my ref). Brian Marick, perhaps, was the figurehead of pushing the broader dev community into non-unit test automation. The visuals have Gantt-chart or timeline connotations perhaps.
More and more our industry pushes activities leftwards in terms of sequencing, and automates them. However, this TCK concept, with its reliance on playback for developers minute to minute and the pushing of re-recording to a less frequent cadence, is a spoonful of shift right in my opinion. It balances out as I see it: a balance of speed and alleged consistency to the left, with formally correct in the “shift right” place. Still automated though, so massively left of yesteryear’s lengthy release certification realities. Yesteryear’s realities were three planned releases a year, each followed by one or two bug-fix releases. See my cynical bonus calculation spreadsheet for extra fun.
Servirtium on GitHub: https://github.com/paul-hammant/servirtium. Here’s an example of a recorded set of collaborator interactions, albeit 207 of them, which is way too many to be a “good” test.
Note that it’s for test suites written in Java, Groovy, Kotlin presently. I’m interested in help porting this to Ruby, Python, C#, Go, and Node.js. Maybe just Rust with bindings to all of those.
And lastly, yes I’m playing off Selenium’s name here, but then I’m co-creator of Selenium V1.0 so I don’t feel like too much of a leech.