Stuff Yaron Finds Interesting

Technology, Politics, Food, Finance, etc.

SOA and the End-To-End Morass

In SOA application modeling there are two basic approaches, end-to-end and hop-by-hop. The end-to-end model is based on an originating sender, a series of intermediaries and a final destination. In the hop-by-hop model each service only knows about the next hop service and nothing more. Below I argue that the end-to-end model inevitably leads to having to create a single protocol that the whole world has to support, requires a painfully sophisticated security model and all but requires that services be tightly coupled. The hop-by-hop model, on the other hand, suffers from none of these problems but does introduce extra latency. On balance I don't believe the benefits of the end-to-end model justify its costs and therefore recommend that service implementers use the hop-by-hop model.

End-To-End Versus Hop-By-Hop

The concept of end-to-end functionality in SOA means that a message will pass from a source service, through a number of intermediary services, to a final service. In the end-to-end model the 'real' communication is between the source and the final receiver; all the other services, the intermediaries, exist for the purpose of adding value to the message before it reaches its final destination. A classic end-to-end scenario is a purchase order (PO) being sent to a transformation service intermediary, then to a logging service intermediary and finally to the actual purchase order processing service.

The alternative to the end-to-end model is the hop-by-hop model. In the hop-by-hop model once one service sends a message to another service the message becomes the receiver's problem. In the hop-by-hop model communication occurs one step at a time rather than the end-to-end model's idea of the communication extending from the source service all the way through the intermediaries to the final service.

For example, let's apply reliable messaging to the previous scenario. In an end-to-end model the reliable message contract would actually be between the source service sending the PO and the purchase service. The intermediaries, the transformation and logging services, are only incidental. This means that the source service will keep around all of its reliable messaging information and state until it receives an acknowledgment from the purchase order service.

In a hop-by-hop model the reliable messaging exchange only exists between each hop individually so there is no state stretching down the entire message chain.
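To make the difference in state concrete, here is a minimal Python sketch of the purchase order scenario. It is only an illustration under simplifying assumptions: the function name `state_holders` and the service names in `PATH` are made up for this example, and no real reliable messaging protocol is being modeled.

```python
PATH = ["source", "transform", "logging", "purchase"]


def state_holders(path, end_to_end):
    """For each hop, report which service must keep retransmission state."""
    snapshots = []
    for sender, receiver in zip(path, path[1:]):
        # End-to-end: the source waits for the final service's acknowledgment,
        # so it holds state during every hop. Hop-by-hop: only the current
        # sender holds state, and drops it once the receiver acknowledges.
        holder = path[0] if end_to_end else sender
        snapshots.append((f"{sender} -> {receiver}", holder))
    return snapshots


for hop, holder in state_holders(PATH, end_to_end=True):
    print("end-to-end", hop, "state held by", holder)
for hop, holder in state_holders(PATH, end_to_end=False):
    print("hop-by-hop", hop, "state held by", holder)
```

In the end-to-end run the source is the holder on every line; in the hop-by-hop run the holder changes with each hop.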

A consequence of the end-to-end model is that the sending service often needs to provide explicit instructions to the intermediaries. This means that in the end-to-end model one has to have a message format that allows one to target commands to the final service, to a specific intermediary and/or to all intermediaries. In other words, the original message with the PO would also include a message addressed to the transform service telling it what kind of transformation to implement, a message to the logging service telling it what kind of logging quality to provide and then a message to the final service telling it what to do with the PO. In a sense an end-to-end system requires a linear processing system that lets a set of commands, addressed to different services, be stacked up inside of a single container message. Each step in the processing path then takes off the part of the message addressed to it, does its part and then passes the message on to the next service.
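The container message described above can be sketched roughly as follows. This is a toy model, not SOAP: the `Envelope` class, the service functions and the command values ("uppercase", "audit", "fulfil") are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Envelope:
    """One container message carrying commands for several services."""
    body: str
    # Maps a service name to the instruction addressed to that service.
    commands: dict = field(default_factory=dict)


def transform_service(env):
    instruction = env.commands.pop("transform")  # take off our part
    if instruction == "uppercase":
        env.body = env.body.upper()
    return env  # pass the rest of the message on


def logging_service(env, log):
    level = env.commands.pop("logging")
    log.append((level, env.body))
    return env


def purchase_service(env):
    action = env.commands.pop("purchase")
    return f"{action}: {env.body}"


env = Envelope(
    body="po #1: 10 widgets",
    commands={"transform": "uppercase", "logging": "audit", "purchase": "fulfil"},
)
log = []
result = purchase_service(logging_service(transform_service(env), log))
print(result)
```

Note that the source has to know, up front, the names of all three services and the commands each one understands.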

In the hop-by-hop model the message handling is typically less sophisticated. The source service would send its PO to the transformation service and get back the result. Then the source service would send the transformed message to the logging service and get back a confirmation. Then the source service would send the transformed PO to the PO processing service. In other words, hop-by-hop systems tend to be hub and spoke based. In the most generic case where large numbers of services are involved there are typically multiple hubs, each of which has its own spokes. This hub and spoke design is typical of how enterprise service buses (ESBs) work. Usually there will be a coordination service inside of the ESB that will be responsible for calling various local services in a specific order, handling errors, etc. before passing the message on to the next (external) hub.
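A coordination service of this kind might look something like the sketch below. Again this is only an illustration: `coordinator`, `transform`, `log_message` and `process_po` are hypothetical stand-ins for what would really be separate network calls.

```python
def transform(po):
    """Spoke 1: return the transformed PO to the caller."""
    return po.upper()


def log_message(po, log):
    """Spoke 2: record the PO and return a confirmation."""
    log.append(po)
    return "logged"


def process_po(po):
    """Next hop: the actual purchase order processing service."""
    return f"accepted {po}"


def coordinator(po):
    """Hub: calls each spoke in order and decides what happens next."""
    log = []
    transformed = transform(po)                    # request and response
    confirmation = log_message(transformed, log)   # request and response
    if confirmation != "logged":
        raise RuntimeError("logging failed; not forwarding the PO")
    return process_po(transformed), log            # pass on to the next hop


result, log = coordinator("po #1: 10 widgets")
print(result)
```

Each call is an ordinary request and response between two parties, so none of the spokes needs to know that the others exist.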

The Cost of the End-To-End Model

One Application Transport to Rule Them All

In the end-to-end model the expectation is that the source service will be able to communicate with both the intermediaries and the final service. This means that there needs to be a single standard message type that can contain instructions to the intermediaries as well as to the final service. But, just to make things more complex, a common assumption of end-to-end systems is that the various hops a message takes might be over different transports, e.g. one hop could be over HTTP, the next over JMS, the next over SIP, etc. This means that the end-to-end model requires a message format that can survive all the hops and contain instructions to the various intermediaries.

In English what this really means is that the end-to-end model requires that everyone use the same transport protocol. But this is a challenge since people like their transports. But computer science tradition provides a solution – add another layer of abstraction.

First, the end-to-end system introduces its own über protocol that will be tunneled through all the other protocols, let's call this new über protocol SOAP. Of course, we can't rely on HTTP URLs since they may not be available at each hop so let's introduce a new addressing mechanism, let's call it, I don't know, um… WS-Addressing. And of course reliable messaging has to be based on the new protocol format rather than any underlying transport features (e.g. no SOA-Rity). Let's call this new reliable messaging functionality WS-ReliableMessaging.

The end result is that for the end-to-end model to work it requires that the entire world switch to a single transport protocol and treat all other transport protocols as little more than expensive implementations of TCP/IP. The inevitable irony being that in the next round of creation someone will show that SOAP isn't end-to-end in SOA++ (or whatever the next thing is called) and SOAP itself gets encapsulated. But that's another story.

In the hop-by-hop model each communication exists on its own terms and can therefore use whatever protocol best fits its need. There is no requirement to create an über protocol.

Security Headaches

In an end-to-end architecture the source services and intermediaries are expected to be aware of other intermediaries as well as the final service and to communicate with those intermediaries directly. In that case one reasonably expects that the source service will have to authenticate itself not just to the final service but also to one or more intermediaries. This is certainly the case with HTTP, where clients can authenticate themselves to both a proxy and the final service.

Ahh but the fun doesn't stop there. Once one accepts the concept of direct intermediary communication then the next step is encryption. Encryption is really fun because in an end-to-end model one can't assume that all the intermediaries and the final service share the same encryption key. So minimally it has to be possible to encrypt the same content multiple times with different keys. To make things even more fun it's likely that some of the intermediaries need to handle content that doesn't need to be encrypted at all. So not only does one need to be able to encrypt data with multiple keys but one has to encrypt pieces of the data so that other pieces can be left in the clear or be encrypted with a different key. Keeping all this in mind suddenly XML DSIG, XML Encryption, WS-Security, etc. start to make sense.

In a hop-by-hop system the general design principle is that each hop only knows about the next hop. So authentication and encryption tend to not extend beyond the next hop. In a hop-by-hop system trust tends not to be across the entire message chain but rather between different systems along the chain. In essence this creates a transitive trust chain where the next hop only worries about trusting the previous hop and recursively backwards to the originating system. Yeah, I too can come up with scenarios where transitive trust is problematic but back in the real world it just doesn't seem to matter terribly often. Even when it does matter usually much simpler solutions like adding an encrypted or signed attachment (rather than inventing a partial message encryption/signature system) will do nicely.
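As a rough illustration of the signed attachment idea, the sketch below uses an HMAC from Python's standard library as a stand-in for a real signature scheme. The key, the message layout and the `relay` function are assumptions made up for this example; a real deployment would pick its own signature or encryption format.

```python
import hashlib
import hmac


def sign(key: bytes, attachment: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the attachment."""
    return hmac.new(key, attachment, hashlib.sha256).hexdigest()


def verify(key: bytes, attachment: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(key, attachment), signature)


# Assumption: the source and the final service share this key out of band.
shared_key = b"demo key known to source and final service"
attachment = b"PO #1: 10 widgets"
message = {"attachment": attachment, "signature": sign(shared_key, attachment)}


def relay(msg):
    """An intermediary forwards the attachment untouched; it needs no key."""
    return dict(msg)


delivered = relay(relay(message))
ok = verify(shared_key, delivered["attachment"], delivered["signature"])

tampered = dict(delivered, attachment=b"PO #1: 1000 widgets")
bad = verify(shared_key, tampered["attachment"], tampered["signature"])
print(ok, bad)
```

The intermediaries treat the attachment as opaque bytes, so the hop-by-hop exchanges stay simple while the final service can still detect tampering.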

Tightly Coupled

There's also another cost to the end-to-end model that is a bit harder to see: end-to-end is not loosely coupled. The end-to-end model is only useful when a service (either the source or an intermediary) has some knowledge (although certainly not complete knowledge) of what services are ahead of it in the processing path and what their capabilities and supported formats are. This locks in a particular set of protocols and functionalities. If any of those services should change their protocol, format, etc. then rather than just negotiating the change with the immediately previous service they have to negotiate the change with every service earlier in the path that relies on them. This puts in place a series of overlapping tight couplings that make it somewhere between difficult and impossible to make any changes to a deployed service.

In the hop-by-hop model each service only communicates with its next service and so changes are localized.

The Cost of the Hop-By-Hop Model

The main cost of the hop-by-hop model is that it depends on using a series of hubs and spokes. This will increase the latency needed to process a request. In the end-to-end model each service just hands a request on to the next service. But in a hop-by-hop model a hub will hand out a request to a service, get back a response, hand it to the next service, get a response, etc. until it is ready to move the message to the next hub, which will repeat the whole process. The actual performance penalty is typically not that high, however, because the services around a hub are usually co-located, so the extra round trip time isn't very high.
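A back-of-the-envelope calculation shows the size of the penalty. The sketch below assumes, purely for illustration, that the intermediaries sit next to the sender in both models, that a local transfer costs 1 ms and that the transfer to the remote final service costs 20 ms; none of these figures are measurements.

```python
def end_to_end_ms(spokes: int, local_ms: int, remote_ms: int) -> int:
    """Message is handed down the chain: one local transfer per
    intermediary, then one transfer to the remote final service."""
    return spokes * local_ms + remote_ms


def hub_ms(spokes: int, local_ms: int, remote_ms: int) -> int:
    """Hub makes a request/response round trip to every spoke, then
    forwards the message to the remote service (or next hub)."""
    return 2 * spokes * local_ms + remote_ms


# Transformation and logging are the two intermediaries (spokes).
e2e = end_to_end_ms(spokes=2, local_ms=1, remote_ms=20)
hub = hub_ms(spokes=2, local_ms=1, remote_ms=20)
print(e2e, hub, hub - e2e)
```

Under these assumptions the hub adds one extra local transfer per spoke, which is small next to the cost of the remote hop.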

Conclusion

The end-to-end model certainly has advantages in terms of performance and sophisticated security scenarios, but the question one has to ask is: are these advantages worth the price of having to enforce a mono-culture protocol, introduce mind-numbingly complex security models and implement a tightly coupled system? Based on my own experiences with various service based models and having lived through the last few years of WS-* I can't but conclude that the end-to-end model costs more than it's worth and therefore recommend that implementers adopt the hop-by-hop model. Simplicity, it seems, has much to recommend it.
