For this article, “service” means over TCP/IP with marshaling, and “component” means same language family (hopefully via dependency injection). Conway’s Law features in this article too, as service vs component is mulled for a real case.

Google’s phone-number library

Libphonenumber is on GoogleCode (Subversion). It is a 300K library that you link into your app when you need international phone number verification and intelligence. The committers (mostly Googlers) maintain Java, JavaScript and C++ ports within that one repo. Others have ported what’s happening there to C#, Objective-C, Ruby, Python and PHP. The Googlers look to be pushing about two releases a month. I guess that reflects changes made by telcos world-wide, as well as by regulators and government bodies. There are also fixes, and an increased understanding of existing phone number designs, that require releases.

I initially thought that the core library was written in C++ with wrappers for Java, but that’s not how it has been done. They’ve made a language-neutral data format, and have idiomatically correct libraries maintained for each language. Besides, the JavaScript deliverable can’t wrap C++ if it needs to work in the browser. Perhaps Node.js was the intended target for the JavaScript binding, making that moot.

Conway’s law

Conway’s law is mentioned because your production architecture might be a consequence of your organizational separations:

organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations

This could have been a service. Developers using it could have invoked it RESTfully, and still considered it fast at less than 0.5ms per invocation. I’ll reuse two diagrams from my Cookie Cutter Scaling article from a few years ago…

Embracing SOA

It could be that a production architecture is designed like so:

If you were making available a libphonenumber service like that, it would be something like the “P” or “Q” nodes below “W”. It would not be quite as represented there, nor need as many nodes, because it’s inherently fast (and doesn’t use a database), but it would still require a design that considers redundancy and scaling. If it is a separate node, then you could redeploy it separately when upgrades are available. You could mount both old and new versions of the service (notwithstanding endpoint considerations).
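As a rough illustration of mounting old and new versions side by side, here is a minimal Python sketch. Everything in it is made up for the example: the `/v1/check` and `/v2/check` paths, the `dispatch` function, and both checker functions, whose rules are toy placeholders rather than libphonenumber's logic.

```python
from typing import Callable, Dict


def check_v1(raw: str) -> bool:
    """Hypothetical older rule set: only counts digits (toy logic)."""
    digits = sum(c.isdigit() for c in raw)
    return 8 <= digits <= 15


def check_v2(raw: str) -> bool:
    """Hypothetical newer rule set: additionally requires a leading '+'."""
    return raw.strip().startswith("+") and check_v1(raw)


# Both versions stay mounted, so callers can migrate at their own pace.
MOUNTED: Dict[str, Callable[[str], bool]] = {
    "/v1/check": check_v1,
    "/v2/check": check_v2,
}


def dispatch(path: str, number: str) -> bool:
    """Route a request to whichever service version is mounted at `path`."""
    try:
        handler = MOUNTED[path]
    except KeyError:
        raise LookupError(f"no service version mounted at {path!r}") from None
    return handler(number)
```

A caller still on the old endpoint keeps getting the old behaviour until it switches paths, which is the point of mounting both. Retiring `/v1/check` then becomes a question of when the last caller has moved.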

As it happens, the libphonenumber team has made a demo web-app, which could easily serve as a RESTful service. If you’re happy to use regex to parse HTML, you could serve it as is. Adding a to-XML capability for more RESTful output is about half a day’s work. If you have Maven installed, here’s how to quickly play with that on your local machine:

svn co http://libphonenumber.googlecode.com/svn/tags/libphonenumber-7.0/
cd libphonenumber-7.0/
mvn gae:unpack
mvn package
mvn war:war
mvn gae:run
# point your browser to http://localhost:8080

You could deploy libphonenumber as a Java process, as they’ve made it, and consume it from any other process regardless of language. SOA gives you that heterogeneous possibility.
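To make the service option a little more concrete, here is a small sketch in Python using only the standard library. It is not the libphonenumber demo app. The `/check` endpoint, the JSON shape and `looks_like_phone_number` are all assumptions, and the toy check merely stands in where a real deployment would call the library.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlparse
from urllib.request import ProxyHandler, build_opener


def looks_like_phone_number(raw: str) -> bool:
    """Hypothetical stand-in for the library call; NOT libphonenumber's rules."""
    digits = [c for c in raw if c.isdigit()]
    return raw.strip().startswith("+") and 8 <= len(digits) <= 15


class PhoneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read ?number=... from the query string and answer with JSON.
        query = parse_qs(urlparse(self.path).query)
        number = query.get("number", [""])[0]
        body = json.dumps(
            {"number": number, "plausible": looks_like_phone_number(number)}
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_service() -> HTTPServer:
    """Start the service on a background thread; port 0 lets the OS pick a port."""
    server = HTTPServer(("127.0.0.1", 0), PhoneHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def check_remotely(port: int, number: str) -> dict:
    """What a consumer does: marshal the number, call over HTTP, unmarshal JSON."""
    url = f"http://127.0.0.1:{port}/check?number={quote(number)}"
    opener = build_opener(ProxyHandler({}))  # bypass any configured HTTP proxy
    with opener.open(url, timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    service = start_service()
    print(check_remotely(service.server_address[1], "+44 20 7946 0958"))
    service.shutdown()
    service.server_close()
```

The consumer here happens to be Python too, but nothing on the wire depends on that. Any language that can make an HTTP request and parse JSON could call the same endpoint.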

Avoiding SOA

It could be that a production architecture is designed like so:

Horizontal scaling is “cookie cutter” in this design. Libphonenumber is now available for use in every node as a component, which necessitates the same language family.
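For comparison, here is a hedged sketch of the component option with dependency injection, again in Python. `PhoneNumberChecker`, `InProcessChecker` and `SignupHandler` are hypothetical names for this example, and the check itself is a toy placeholder where a real app would delegate to a libphonenumber port for its language.

```python
from typing import Protocol


class PhoneNumberChecker(Protocol):
    """The interface the application depends on."""

    def is_plausible(self, raw: str) -> bool: ...


class InProcessChecker:
    """Hypothetical component; a real one would wrap a libphonenumber port."""

    def is_plausible(self, raw: str) -> bool:
        digits = [c for c in raw if c.isdigit()]
        return raw.strip().startswith("+") and 8 <= len(digits) <= 15


class SignupHandler:
    """Application code receives the checker instead of constructing it."""

    def __init__(self, checker: PhoneNumberChecker) -> None:
        self._checker = checker

    def register(self, name: str, phone: str) -> str:
        if not self._checker.is_plausible(phone):
            return f"rejected: {phone!r} does not look like a phone number"
        return f"registered {name}"
```

Usage is a plain in-process call, with no marshaling or network hop: `SignupHandler(InProcessChecker()).register("Ada", "+1 650 253 0000")`. Because the handler depends only on the interface, a checker that calls a remote service could be injected later without touching `SignupHandler`.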

Many companies are pushing towards Continuous Delivery these days. If you’re able to deploy frequently, you would be able to keep up with the releases of libphonenumber and not sweat the fact that the whole app is being redeployed. That deployment simplicity is very attractive, especially while considering non-live environments too (particularly microcosm deployments under CI running Selenium functional tests). In production, you might even save that half-millisecond on response times, but there is a high probability that nobody notices that.
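To put that half-millisecond in perspective, here is a back-of-envelope model in Python. The 500 microsecond figure is the per-invocation number used earlier in this article; the 50 microsecond in-process cost is an assumed number purely for illustration, not a measurement. The model is also idealized: sequential calls simply add up, and fully parallel calls overlap completely.

```python
def total_latency_us(calls: int, per_call_us: int, parallel: bool) -> int:
    """Idealized total latency in microseconds for a batch of lookups."""
    if calls <= 0:
        return 0
    # Fully parallel calls cost one round trip; sequential calls cost one each.
    return per_call_us if parallel else calls * per_call_us


IN_PROCESS_US = 50   # assumption for illustration only
REMOTE_US = 500      # "less than 0.5ms per invocation" from the article

if __name__ == "__main__":
    for calls in (1, 10, 100):
        print(
            calls,
            total_latency_us(calls, IN_PROCESS_US, parallel=False),
            total_latency_us(calls, REMOTE_US, parallel=False),
            total_latency_us(calls, REMOTE_US, parallel=True),
        )
```

For a single lookup per request the difference is small in absolute terms. It only starts to matter if one request makes many sequential lookups, in which case the remote cost multiplies while the parallel case stays near one round trip.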

Question to the reader: What would you do for this - a service or a component? Were there any pros or cons that I didn’t list?

Updates

11th November 2014: Play with an online PHP port here (credits to Joshua Gigg).



Published

November 9th, 2014

Syndicated by DZone.com

Comments formerly in Disqus, but exported and mounted statically ...


Tue, 11 Nov 2014 Vanja Pejovic

I'm not sure if you were limiting the discussion to private services or not, but I'll assume you weren't. By private, I mean that it's run by one team and only used by that same team. If it's used by multiple teams, I would call it a shared service.

If it's a shared service, I think there are a few more things to consider:

- Quota and related. How well behaved will the clients be? Who pays for the resources used to serve a request? This is not an issue as a component.
- API evolution. How many versions of the API will the service need to support, and how long will each version be supported? As a service, you need to make sure everyone has moved to the new version before killing the old one. As a component, you can choose to break their build or fix their code.
- Latency. Can you locate your service such that none of your clients are too far away?
- Support. Is there a team that primarily supports this service? How will they prioritize requests from other teams?

I think that in this specific case, I would probably stick to a component. It's already done, and seems like it should do the trick.

One downside of the component approach that you did not mention is binary size and compilation time since you depend on the whole implementation instead of just the interface.

Tue, 11 Nov 2014 paul_hammant

For Google themselves I guess it does not matter whether a component is shared internally, as the costs to build/support are low: https://paulhammant.com/2013... and https://paulhammant.com/2014.... For them too, "fix their code" is the way it would be done (single commit), which informs support too, if there's only ordinarily a part-time team "owning" a module/component.

Good point about the size of the binary: it's 300K or so once compiled. I had a client once that pulled a 700K geoip library into their Java web app's WAR file to avoid a service.

You touch on payment for resources. It is of course MUCH easier to charge accurately for service invocations in a SOA architecture, but itemized billing itself can inflate the overall cost of the experience (at least it does in healthcare ;)

Tue, 11 Nov 2014 Henrik Eriksson

I'd say it wouldn't matter unless you specifically want to use code for a different, incompatible runtime. The fault is to treat it as if there were any difference. Locality of code doesn't matter; APIs matter.

Wed, 12 Nov 2014 Markus Kohler

With regards to performance, note that what matters may not be how long a call takes overall (0.5ms) as a web service, but rather how much longer it takes than an in-process call. I would assume in this case the network overhead is probably an order of magnitude.
The reason is that if you based all decisions just on that overall number, you would want to do everything as a service, and then if you don't send your requests in parallel you would pay a huge price with regards to performance.

Another reason I see for going SOA/microservices is whether what is done needs to be constrained with regards to resource consumption, e.g. memory, CPU and I/O. Today most language runtimes do not support that; e.g. Java has a global heap and is not able to restrict CPU usage for certain threads. The only mechanism that works in general at the moment is to use operating system features to restrict memory and CPU usage per process (Linux cgroups). From what I know about the library, it does not seem to me that it would need those restrictions.
If it's very I/O intensive (not the case for this library, I guess) you might also want to put it into its own server, to be able to scale it independently and to be able to put it on machines with different hardware (SSDs?).

One more reason is if what you do could potentially break or kill your server (fail-over capabilities). E.g. if this library were implemented in C and linked into Java, it could easily crash the whole Java server. That could be another reason to put it into its own process/SOA.

So to me it looks like there are good reasons for "Not to SOA"