Designing Service Oriented Architecture (SOA): Latency & Re-Use, what you must get right and what you can afford to get wrong

In designing an SOA-based system I believe the number one concern should be latency. As I discuss in my article on RPCs and Protocols, getting latency right is extraordinarily hard. So my advice to people designing SOA systems is: worry about latency. If you manage to get past latency then you can spare a thought for re-use, which is where loose coupling and versioning come in. But in my opinion most people can safely ignore both loose coupling and versioning. After all, even if you try to tackle them you're likely to get them wrong (and no, that's not an insult, the 'experts' who design nothing but protocols and protocol implementations screw it up all the time) so why waste time and energy when you could be focused on shipping your service?

What is SOA?

Ignoring the hype (which admittedly is something of a feat in itself) SOA boils down to – networked code is good. True, this has been understood literally since at least the dawn of modern computing and if I wasn't too lazy I'm sure I could find Sci-Fi examples that predate modern computing. What is new is that the cost of networking has become so ridiculously low that everyone can get in on the act.

It's tempting to argue that the idea of 'service' in SOA is somehow special but it's really not. In theory the difference between a service and a non-service oriented networking system is that a non-service oriented networking system consists of a random collection of network accessible interfaces that are all part of a single monolithic program. In a services approach one has collections of well defined, carefully related interfaces that make up individual service instances that exist largely (but in contradiction to the myth of statelessness, not completely) independently of each other. The most immediate predecessor to the service is HTTP web services, where each transaction (in the general, not ACID, sense) or group of related transactions was given its own unique service instance. It turns out that using a bunch of unique instances instead of a single huge chunk of code servicing everyone performs significantly better, mostly because it makes it easy to spread service instances across many machines so the system can be made to scale linearly by adding more boxes.

But we shouldn't get too excited by the obvious. Back in the dusty past when a computer was a single monolithic machine the concept of a service probably also existed but it was probably internal to the software instead of used to create completely separate computational entities. The UNIX fork system call, an archetype of the service paradigm, appears to go back at least to the 1970s and Dijkstra had a paper on process based computing in 1965. All that is 'new' today is that economically, in many cases, it's cheaper to have many powerful computers rather than one super powerful computer and so services got pulled out of the software and made into stand alone entities. It's an important concept but we aren't exactly inventing the wheel here.

It's Latency, Stupid

Still, to belabor the obvious, networking matters and so do services. In dealing with services I suspect the first hurdle for most people will be that writing networked code is a fundamentally unnatural act. As I discussed in a previous article on the difference between RPCs and protocols, writing networked code means dealing with latency and dealing with latency means thinking in 'chunks' of functionality instead of in finely separated pieces. Give the article a read, I think it does a good job of explaining the issues. My guess is that surmounting this problem will take the enterprise application development community a non-trivial period of time. It really does require re-thinking a lot about the fundamentals of one's program design.
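To make the 'chunks' point concrete, here is a minimal sketch of the same job done through a fine-grained interface and through a chunky one. Everything in it is made up for illustration: the `SimulatedRemoteCatalog` class, its `get_price` and `get_prices` methods, and the 50 ms round-trip figure are assumptions, and the 'network' is just a counter rather than a real connection.

```python
from dataclasses import dataclass

ROUND_TRIP_MS = 50  # assumed cost of one network round trip; purely illustrative


@dataclass
class SimulatedRemoteCatalog:
    """Hypothetical stand-in for a remote service; counts round trips instead of sleeping."""
    prices: dict
    round_trips: int = 0

    def get_price(self, item):
        # Fine-grained call: every item costs its own round trip.
        self.round_trips += 1
        return self.prices[item]

    def get_prices(self, items):
        # Chunky call: the whole batch travels in a single round trip.
        self.round_trips += 1
        return {item: self.prices[item] for item in items}

    def simulated_latency_ms(self):
        return self.round_trips * ROUND_TRIP_MS


def total_chatty(catalog, items):
    # Written the way one would write local code: one call per item.
    return sum(catalog.get_price(item) for item in items)


def total_chunky(catalog, items):
    # Written for the network: ask for everything at once, then work locally.
    prices = catalog.get_prices(items)
    return sum(prices[item] for item in items)


items = ["apple", "pear", "plum"]
price_list = {"apple": 3, "pear": 4, "plum": 5}

chatty = SimulatedRemoteCatalog(price_list)
chunky = SimulatedRemoteCatalog(price_list)

chatty_total = total_chatty(chatty, items)
chunky_total = total_chunky(chunky, items)

print(chatty_total, chatty.simulated_latency_ms())  # 12 150
print(chunky_total, chunky.simulated_latency_ms())  # 12 50
```

Both versions compute the same total, but in this toy model the chatty one pays for a round trip per item while the chunky one pays once no matter how long the list gets. Real networks are messier than a counter, so treat the numbers as a way of seeing the shape of the problem rather than as a measurement.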

Re-Use

But once the community makes peace with latency the next big challenge will be re-use. Writing a good service is hard and once one successfully does so there will be a strong urge to re-use that service in other contexts. Re-use is really hard as it requires mastering two difficult skills – loose coupling and versioning.

It is easy to design a service so that its interfaces are useful only to a specific version of the specific service it was meant to interface with. But when another version of that partner service is created or when new services are introduced that could re-use the existing service, well, things can get sticky. The solution to this problem is loose coupling. Loose coupling is really just another phrase for 'good design' and can be read as "don't screw up". Generally when people talk about loose coupling they talk about the ability to add or remove information from messages without breaking services that are trying to consume the message. But honestly, I think that misses the point.

Changing an interface just isn't that big a deal, you can easily throw on new ones as needed. The really big deal with loose coupling therefore isn't interface design, it's how you structure your code. Successfully implementing loose coupling means designing one's code so it is free of assumptions about who you are or will ever want to talk to. That way who you are talking to can change and your code can just roll with it.

It's one thing to say "I'm writing service A to work with service B." But what loose coupling demands is that one say "I'm writing service A to work with service B, both the current and all future versions of B, as well as any other kind of service that might need to re-use what I'm providing." Maybe there are architects and developers who are good enough to pull off a re-usable design, but let's face it, architects at that level are rare and most companies would not necessarily profit from employing them. Sometimes it's cheaper to get it wrong. So personally, I wouldn't stress about loose coupling. I realize my view is heresy but I tend to doubt there will be much profit in beating one's developers bloody trying to write a service to be 'generic' and thus meet all sorts of requirements that don't actually exist yet.¹

I believe this 'get it wrong' philosophy also applies to versioning.

Most folks reading this article are writing what I call an N:1 service, that is, a service for which there is only one running example. As I discuss in detail here, most folks with N:1 services are better off not worrying too much about versioning. As my article explains, most people can safely navigate their way to the next version of their N:1 service without a lot of preparation ahead of time so don't sweat it. Sure, you would probably save an enormous amount of money if you got it all right and could properly handle versioning but then again the number one challenge for most projects is shipping at all.

As awful as it sounds, attempts to get re-use right are more likely than not to fail, so it's probably better to just focus on getting the first version of your service out the door.

Conclusion

Some problems you have to solve because not doing so means your system will fail. Other problems you want to solve because not doing so will cost you later. The issue of latency falls into the first category. If your service can't deal with latency then it will likely fail. Re-use falls into the second category. Sure, getting it right the first time will save you a lot of money later but chances are you won't get your re-use strategy right anyway. So it's probably better to focus on getting the service out the door and letting re-use take care of itself. After all, if you have a re-use problem it's because your service was a success. So I see re-use as one of those problems you'll be lucky to have.

¹ Note, however, that none of this excuses one from following the good coding practices that the ideas behind loose coupling have inspired. For example, when you read an XML message make sure to only look at the parts of the message you actually need and ignore the rest of the message. That way parts of the message that don't affect you can change without breaking you. This is, of course, nothing more than a restatement of one of the oldest principles of good network code design: be generous in what you accept and stingy in what you send.
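As a rough sketch of that practice, the reader below pulls out only the two fields it cares about and never looks at anything else in the message. The `order` message, its element names, and the `read_order` function are all hypothetical, invented for this example; the parsing uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def read_order(xml_text):
    """Read only the fields this consumer needs; ignore everything else."""
    root = ET.fromstring(xml_text)
    return {
        # findtext looks up a direct child by name, wherever it sits
        # among its siblings, and never touches the other elements.
        "id": root.findtext("id"),
        "quantity": int(root.findtext("quantity")),
    }


# The message as the sender first shipped it (hypothetical format).
ORDER_V1 = "<order><id>A17</id><quantity>2</quantity></order>"

# A later version: new elements added, existing ones reordered around them.
ORDER_V2 = (
    "<order>"
    "<id>A17</id>"
    "<giftWrap>true</giftWrap>"
    "<quantity>2</quantity>"
    "<notes><note>fragile</note></notes>"
    "</order>"
)

order_v1 = read_order(ORDER_V1)
order_v2 = read_order(ORDER_V2)

print(order_v1)  # {'id': 'A17', 'quantity': 2}
print(order_v2)  # {'id': 'A17', 'quantity': 2}
```

The sender added `giftWrap` and `notes` and this reader didn't notice, which is the whole point. It would still break if `id` or `quantity` were removed or renamed, so this buys tolerance for additions, not a general versioning strategy.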
