In a previous article I had argued that the end-to-end model was a bad one for SOA. In comments on that article Nick Gall made the point that I was using the term end-to-end incorrectly. I countered that the meaning of the term was different for application protocols than for transport protocols (where Nick's usage came from). Below I explain how HTTP's violation of the 'end-to-end principle' sent application protocols on a path where the very idea of 'end-to-end' changed with unfortunate consequences for SOA.
(Nick, remember, you asked. :)
It's the Proxies What Done It!
One of Nick's comments on my end-to-end article uses the classic example of how end-to-end and hop-by-hop are different in transport and application protocols – HTTP caching proxies.
Following the end-to-end principle as it applies at the transport layer, one wants routers (read as "intermediaries" by application protocol types) to be very dumb and to do as little work as possible (but to do it as quickly as possible, hence the old saying – dumb and fast). This means that any work routers do besides just routing (e.g. firewalling, link level encryption, etc.) should be completely invisible to the end points.
So applying this principle to an application protocol like HTTP, one wants HTTP intermediaries, mostly caching proxies, to be dumb and fast. But they aren't. They are hideously intelligent and slow, which has led to serious problems.
When HTTP was first designed and caches were added, the general thought was that HTTP caching proxies would be largely (but not completely) 'invisible' in a similar (but again, not exactly equivalent) sense that a router that uses link level encryption between itself and another router is 'invisible' to the end points. Some companies (Cisco comes to mind) even invented 'invisible/transparent' HTTP caching proxies that they put directly into their IP routers. The idea was that the client would send off a request and get a response without ever knowing whether the response came from a proxy cache or the HTTP server.
The end result (as I can testify from the days when I was the Program Manager for WinInet, the HTTP/FTP stack in IE) was a disaster. The problem is that a cache response is not the same thing as a response from the destination server. For example, in some cases you want to be guaranteed that you are getting the absolute latest response and explicitly don't want a response from a cache. In other cases caches behaved badly (e.g. they might ignore cache content expirations, re-compress pictures to save space while destroying quality, or fail to implement the HTTP protocol correctly) and you needed to know that such a cache was in your communication chain so you could work around its bad behavior. In still other cases you didn't even have 'end-to-end' connectivity: the local network would literally reject all port 80 TCP requests (i.e. HTTP requests) that didn't go through the proxy, which introduced several years of nightmares in the area of automatic (and not so automatic) proxy configuration for browsers.
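To make the first of those cases concrete: the only way a client can demand the absolute latest response is to explicitly instruct every cache on the path to step aside. The directive names below are real HTTP/1.1 and HTTP/1.0 mechanisms; the helper function itself is a hypothetical sketch.

```python
# Sketch: the headers a client sends to force an end-to-end
# reload instead of accepting a stored copy from some cache.
# Directive names are real (RFC 2616 / HTTP 1.0); the helper
# function is invented for illustration.

def no_cache_headers():
    """Headers instructing every cache on the path to revalidate."""
    return {
        # HTTP/1.1: caches must not serve a stored response
        # without revalidating it with the origin server.
        "Cache-Control": "no-cache",
        # HTTP/1.0 caches don't understand Cache-Control, so
        # the older Pragma directive is sent alongside it.
        "Pragma": "no-cache",
    }

headers = no_cache_headers()
print(headers["Cache-Control"])  # no-cache
```

Note that even this simplest case already requires the client to know about, and speak to, the intermediaries rather than just the destination server.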
The bottom line is that the whole concept of the invisible intermediary just doesn't work outside of a system that truly complies with the 'end-to-end principle', where intermediaries are dumb and fast. As soon as intermediaries start getting 'smart' and manipulating traffic, their 'invisibility' vanishes and you need to talk to them, instruct them, etc.
Indeed, go look at RFC 2616 and you will notice that big chunks of the spec are all about caching. When I represented IE in the IETF HTTP Working Group I spent a bunch of my time involved in arguments about caching issues because of all the problems we had getting good behavior through caches.
In other words the cost of violating the end-to-end principle in HTTP was that HTTP became a substantially more complex protocol. No longer could a source just worry about talking to its destination. Now the source had to worry about talking to all the intermediaries on the communication path as well as the destination. There is a very big cost for violating the end-to-end principle's admonition to keep your infrastructure dumb and fast.
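To give a feel for that complexity: even the basic question "may this cached response be served?" turns into an age-versus-lifetime calculation driven by header directives. Below is a deliberately simplified sketch of RFC 2616's freshness rule; the directive names are real, but the logic omits the full Age/Date clock-skew corrections the spec requires.

```python
# Simplified sketch of RFC 2616's cache freshness rule: a stored
# response may be served without revalidation only while its
# current age is less than its freshness lifetime. Directive
# names are real; the arithmetic is heavily simplified.

def parse_cache_control(value):
    """Parse e.g. 'max-age=60, must-revalidate' into a dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if "=" in part:
            name, _, val = part.partition("=")
            directives[name.strip()] = val.strip()
        elif part:
            directives[part] = True
    return directives

def is_fresh(cache_control, current_age_seconds):
    """True if a cached response may still be served as-is."""
    directives = parse_cache_control(cache_control)
    if "no-cache" in directives or "no-store" in directives:
        return False
    if "max-age" in directives:
        return current_age_seconds < int(directives["max-age"])
    # No explicit lifetime: simplified here to 'stale' (the RFC
    # actually allows heuristic freshness based on Last-Modified).
    return False

print(is_fresh("max-age=60", 30))  # True
print(is_fresh("max-age=60", 90))  # False
print(is_fresh("no-cache", 0))     # False
```

Multiply this by directives like s-maxage, must-revalidate, private, and the Vary header, and it becomes clear why so much of RFC 2616 is consumed by caching.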
Where's the End in End-To-End?
In the case of HTTP, however, the cost was well worth paying, especially in the early days of the Internet when bandwidth was at a premium. Without HTTP proxies, early Internet access providers like AOL would not have so quickly jumped on the HTTP bandwagon. So yes, violating the end-to-end principle had some very painful consequences, but the benefits were worth the price.
A consequence, however, of the HTTP experience was that the very idea of end-to-end and hop-by-hop started to change in the application protocol world. For example, in TCP all the header fields are targeted at the destination. If the routers (i.e. intermediaries) do anything, it should be invisible, so there is no need to target any information at them. But in HTTP that isn't the case. In HTTP there was a need to distinguish between information targeted at intermediaries and information targeted at the end point.
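RFC 2616 makes that distinction concrete: it names a fixed set of hop-by-hop headers, plus anything listed in the Connection header, and requires a proxy to consume them rather than forward them. The header names below come from the RFC; the stripping function is an illustrative sketch, not a real proxy implementation.

```python
# RFC 2616 section 13.5.1 designates these headers as hop-by-hop:
# they are meaningful only to the next node on the path and must
# not be forwarded. Everything else is end-to-end.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate",
    "proxy-authorization", "te", "trailers",
    "transfer-encoding", "upgrade",
}

def forwardable_headers(headers):
    """Return only the end-to-end headers a proxy may forward."""
    # The Connection header can name additional per-hop headers
    # that must also be stripped before forwarding.
    extra = {
        name.strip().lower()
        for name in headers.get("Connection", "").split(",")
        if name.strip()
    }
    drop = HOP_BY_HOP | extra
    return {k: v for k, v in headers.items() if k.lower() not in drop}

request = {
    "Host": "example.com",
    "Connection": "Keep-Alive, X-Internal",
    "Keep-Alive": "timeout=5",
    "X-Internal": "hop-only",
    "Cache-Control": "no-cache",
}
print(sorted(forwardable_headers(request)))
# ['Cache-Control', 'Host']
```

Nothing like this exists in TCP: there is no mechanism for addressing a header field to a router, because a router is never supposed to look.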
When Henrik and friends designed SOAP they took the HTTP experience to the next logical level. They simply accepted that the end-to-end principle's admonition that intermediaries be dumb and fast would not be honored, and instead put in an explicit framework allowing intermediaries to be intelligent and to be directly addressed as part of the delivery of a message. This actually led to some weird consequences. For example, from time to time I used to drop by the SOAP 1.2 and WSDL 2.0 working groups at the W3C (in support of BEA's representatives to those groups, David Orchard and Mark Nottingham). When those groups talked about end-to-end they started introducing new terms like 'ultimate destination' and 'ultimate sender' to distinguish the 'observable' receivers or senders, which were just as likely to be intermediaries, from the actual 'ultimate' source or destination. The problem was that once you accept smart intermediaries it gets really hard to figure out where the start and end of a message path actually are. The whole concept of what the 'end' is in 'end-to-end' gets fuzzy.
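Concretely, SOAP's framework for addressing intermediaries is the role attribute (called actor in SOAP 1.1) on each header block: a node processes only the blocks targeted at roles it plays. The sketch below uses the real SOAP 1.2 envelope namespace and standard role URIs, but the envelope and dispatch function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Real SOAP 1.2 envelope namespace and standard role URIs.
ENV = "http://www.w3.org/2003/05/soap-envelope"
NEXT = ENV + "/role/next"
ULTIMATE = ENV + "/role/ultimateReceiver"

# Invented example: one header block targeted at the next
# intermediary, one at the ultimate receiver.
ENVELOPE = """\
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <log xmlns="urn:example" env:role="http://www.w3.org/2003/05/soap-envelope/role/next">trace</log>
    <auth xmlns="urn:example" env:role="http://www.w3.org/2003/05/soap-envelope/role/ultimateReceiver">token</auth>
  </env:Header>
  <env:Body/>
</env:Envelope>
"""

def blocks_for_role(envelope_xml, role):
    """Return the header blocks a node acting in `role` must process."""
    root = ET.fromstring(envelope_xml)
    header = root.find("{%s}Header" % ENV)
    result = []
    for block in header:
        # Per SOAP 1.2, a header block with no role attribute is
        # targeted at the ultimate receiver.
        target = block.get("{%s}role" % ENV, ULTIMATE)
        if target == role:
            result.append(block.tag)
    return result

print(blocks_for_role(ENVELOPE, NEXT))
# ['{urn:example}log']
```

The point of the sketch is the design stance it embodies: the message format itself assumes smart intermediaries, which is exactly the inversion of the transport-layer end-to-end principle.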
What was really happening is that the whole concept of an 'end point' as a single process on a single machine was disappearing. Intermediaries were getting so bloody smart (at least in theory; in practice, by the way, I can't find many customers who use SOAP-level intermediaries in non-trivial real-world situations) that the intermediaries effectively became part of the 'end point'.
In essence the collection of intermediaries could be thought of as forming a single virtual end point whose logic was distributed across multiple nodes. This 'virtual end point' problem wouldn't have mattered if the distributed end points had appeared to the outside world as a single end point, but in practice the SOAP folks seemed to want to expose the fact that the 'virtual end points' were 'virtual' and included many intermediaries, and so the headaches just got worse. (A classic example of this problem: who exactly acknowledges a message in a SOAP-level reliable messaging protocol? An intermediary? The ultimate destination?) Hence terms like 'ultimate destination' showed up.
Simplicity Really Is Good
So slowly, without anyone even realizing it, the whole idea of end-to-end had changed. Now end-to-end was about moving a message from the 'ultimate sender' to the 'ultimate destination' while addressing all the intermediaries in between. Hop-by-hop, on the other hand, had come to mean either sending a message point to point without a bunch of intermediaries, or addressing information just to the next hop in an end-to-end design (e.g. RFC 2616's usage). But the key thing to note is that end-to-end in the IETF sense of smart end points with dumb intermediaries was well and truly dead. Intermediaries were no longer dumb and fast; they were smart and slow.
While I believe the case for HTTP violating the end-to-end principle was a strong one, the point of the article Nick responded to was that I don't believe violating the end-to-end principle in SOA scenarios is worth the cost. Unlike caching in HTTP, I don't believe there exists an equivalent high-value feature set that is worth the 70+ specs of crud that Web Services in particular have dumped on us as they attempt to make intermediaries smarter and smarter. My argument is that in SOA we are better off not having smart intermediaries, and instead radically simplifying our SOA protocol stack and moving back to a true end-to-end model. Yes, there can be benefits to smart intermediaries, but I don't believe those benefits outweigh the costs in the case of SOA.
But the irony is that I couldn't use the term 'end-to-end' the way I do in the paragraph above and be understood by the application protocol people I communicate with on a daily basis. The term 'end-to-end' now means what HTTP made it mean: it is all about sending a message from the (to use the SOAP terms) ultimate sender to the ultimate receiver. Thinking of end-to-end and hop-by-hop in this way doesn't make much sense in the world of transport protocols, where intermediaries are truly invisible, but since HTTP and SOAP it is how these things are thought of in the application world.
But Prove it!
In Nick's comments he challenged me to come up with printed examples of people using end-to-end and hop-by-hop in the way I use the terms in the article. I doubt I can come up with examples that will satisfy him. Terminology is constantly changing and I had to pick the terms I felt would be best understood by my target audience. Based on my conversations with the people I work with in the standards world as well as with customers my general impression is that I used the terms end-to-end and hop-by-hop in the manner they are generally understood by the community I am a member of.
The distinction between end-to-end in transport and application protocols is a subtle one. While end-to-end in the transport world mostly means the 'end-to-end principle', the idea of smart edges and dumb, fast intermediaries, in the application world end-to-end involves smart and, by necessity, very visible intermediaries that are full partners in the transmission of application messages. The difference in world views leads to very different architectural outcomes. The irony is that in the article that started this discussion I was actually arguing for SOA (machine-to-machine communication) to return to a true end-to-end design (in the transport protocol sense, i.e. the end-to-end principle) and away from the smart intermediary design heralded by HTTP.