Stuff Yaron Finds Interesting

Technology, Politics, Food, Finance, etc.

End-To-End Confusion – The Changing Meaning of End-To-End in Transport and Application Protocols

In a previous article I had argued that the end-to-end model was a bad one for SOA. In comments on that article Nick Gall made the point that I was using the term end-to-end incorrectly. I countered that the meaning of the term was different for application protocols than for transport protocols (where Nick's usage came from). Below I explain how HTTP's violation of the 'end-to-end principle' sent application protocols on a path where the very idea of 'end-to-end' changed with unfortunate consequences for SOA.

(Nick, remember, you asked. :)

It's the Proxies What Done It!

One of Nick's comments on my end-to-end article uses the classic example of how end-to-end and hop-by-hop are different in transport and application protocols – HTTP caching proxies.

Following the end-to-end principle as it applies at the transport layer one wants routers (read as "intermediaries" by application protocol types) to be very dumb and to do as little work as possible (but do it as quickly as possible, hence the old saying – dumb and fast). This means that any work routers do besides just routing (e.g. firewalling, link level encryption, etc.) should be completely invisible to the end points.

So applying this principle to an application protocol like HTTP one wants HTTP intermediaries, mostly caching proxies, to be dumb and fast. But they aren't. They are hideously intelligent and slow, which has led to serious problems.

When HTTP was first designed and caches were added the general thought was that HTTP caching proxies would be largely (but not completely) 'invisible' in a similar (but again, not exactly equivalent) sense that a router that uses link level encryption between itself and another router is 'invisible' to the end points. Some companies (Cisco comes to mind) even invented 'invisible/transparent' HTTP caching proxies that they put directly into their IP routers. The idea was that the client sends off a request and gets a response without ever knowing if the response came from a proxy cache or the HTTP server.

The end result (as I can testify from the days when I was the Program Manager for WinInet, the HTTP/FTP stack in IE) was a disaster. The problem is that a cached response is not the same thing as a response from the destination server. For example, in some cases you want to be guaranteed that you are getting the absolute latest response and explicitly don't want a response from a cache. In other cases caches behaved badly (e.g. they might ignore cache content expirations, or re-compress pictures in order to save space but end up destroying quality, or not even implement the HTTP protocol correctly, etc.) and you needed to know that such a cache was in your communication chain so you could work around the cache's bad behavior. In other cases you didn't even have 'end-to-end' connectivity: the local network would literally reject all port 80 TCP requests (i.e. HTTP requests) if they didn't go through the proxy, which introduced several years of nightmares in the area of automatic (and not so automatic) proxy configuration for browsers.
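To make the workarounds above a little more concrete, here is a minimal Python sketch of the two things a client typically ended up doing: asking every cache on the path to check back with the origin server, and looking for signs that an intermediary handled the response. The function names are hypothetical and invented for this illustration; the header names (Cache-Control, Pragma, Via, Age) are the standard HTTP ones. Keep in mind that a truly 'transparent' or badly behaved proxy may add none of these headers, which is exactly the problem described above.

```python
def fresh_request_headers(extra=None):
    """Build request headers asking caches not to answer from a stored copy
    without first revalidating with the origin server.
    (Hypothetical helper name, used only for this sketch.)"""
    headers = {
        "Cache-Control": "no-cache",  # HTTP/1.1 directive
        "Pragma": "no-cache",         # HTTP/1.0 fallback for older caches
    }
    headers.update(extra or {})
    return headers


def intermediary_hints(response_headers):
    """Return the response headers that hint an intermediary was involved.
    Via is added by proxies that announce themselves; Age is added by caches
    serving a stored response. Absence of both proves nothing."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return {name: lowered[name] for name in ("via", "age") if name in lowered}
```

For example, calling intermediary_hints on a response that carries a Via and an Age header returns both, while a response straight from the origin server (or through a proxy that hides itself) returns an empty dict.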

The bottom line is that the whole concept of the invisible intermediary just doesn't work outside of a true 'end-to-end principle' compliant system where intermediaries are dumb and fast. As soon as the intermediaries start getting 'smart' and manipulating things, their 'invisibility' very quickly disappears and you need to talk to them, instruct them, etc.

Indeed, go look at RFC 2616 and you will notice that big chunks of the spec are all about caching. When I represented IE in the IETF HTTP Working Group I spent a bunch of my time involved in arguments about caching issues because of all the problems we had getting good behavior through caches.

In other words the cost of violating the end-to-end principle in HTTP was that HTTP became a substantially more complex protocol. No longer could a source just worry about talking to its destination. Now the source had to worry about talking to all the intermediaries on the communication path as well as the destination. There is a very big cost for violating the end-to-end principle's admonition to keep your infrastructure dumb and fast.

Where's the End in End-To-End?

In the case of HTTP however the cost was well worth paying. Especially in the early days of the Internet when bandwidth was at a premium. Without HTTP proxies early Internet access providers like AOL would not have so quickly jumped on the HTTP bandwagon. So yes, violating the end-to-end principle had some very painful consequences but the benefits were worth the price.

A consequence, however, of the HTTP experience was that the very idea of end-to-end and hop-by-hop started to change in the application protocol world. For example, in TCP all the header fields are targeted at the destination. If the routers (i.e. the intermediaries) do anything it should be invisible, so there is no need to target any information at them. But in HTTP that isn't the case. In HTTP there was a need to distinguish between information targeted at intermediaries and information targeted at the end point.
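One place this distinction shows up concretely is in RFC 2616's split between hop-by-hop headers, which are meant only for the next node on the path, and end-to-end headers, which are meant to be forwarded. The Python sketch below shows roughly how a proxy might separate the two. The function name split_headers is made up for this illustration; the list of hop-by-hop headers is the one given in RFC 2616 section 13.5.1, plus any header named in the Connection header.

```python
# Hop-by-hop headers as listed in RFC 2616 section 13.5.1, lowercased.
# The RFC's list spells one entry "Trailers"; the header itself is "Trailer",
# so both spellings are included here.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
}


def split_headers(headers):
    """Split a header dict into (hop_by_hop, end_to_end).
    (Hypothetical helper name, used only for this sketch.)"""
    lowered = {name.lower(): value for name, value in headers.items()}
    # Any header named in the Connection header is hop-by-hop as well.
    connection_tokens = {
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
        if token.strip()
    }
    hop_names = HOP_BY_HOP | connection_tokens
    hop = {n: v for n, v in headers.items() if n.lower() in hop_names}
    end = {n: v for n, v in headers.items() if n.lower() not in hop_names}
    return hop, end
```

A proxy following this rule would consume the first dict itself and forward only the second, which is precisely the kind of 'talking to the network' that TCP headers never needed.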

When Henrik and friends designed SOAP they took the HTTP experience to the next logical level. They simply accepted that the end-to-end principle's admonition that intermediaries were to be dumb and fast would not be honored and instead put in an explicit framework to allow for intermediaries to be intelligent and to be directly addressed as part of the delivery of a message. This actually led to some weird consequences. For example, from time to time I used to drop by the SOAP 1.2 and WSDL 2.0 working groups in the W3C (in support of BEA's representatives to the group David Orchard and Mark Nottingham). When those groups talked about end-to-end they started introducing new terms like 'ultimate destination' and 'ultimate sender' to distinguish these from the 'observable' receivers or senders, which were just as likely to be intermediaries as the actual 'ultimate' source or destination. The problem was that once you accept smart intermediaries it gets really hard to figure out where the start and end of a message path actually is. The whole concept of what the 'end' is in 'end-to-end' gets fuzzy.
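For readers who haven't seen how SOAP addresses intermediaries directly, the mechanism in SOAP 1.2 is a role attribute on each header block. The Python sketch below groups the header blocks of an envelope by the role they target. The function name, the example envelope and its urn:example namespaces are all invented for this illustration; the envelope namespace and the 'next' and 'ultimateReceiver' role URIs are the ones defined by SOAP 1.2. Treating an empty role attribute the same as a missing one is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

ENV = "http://www.w3.org/2003/05/soap-envelope"
NEXT = ENV + "/role/next"                  # the next node on the message path
ULTIMATE = ENV + "/role/ultimateReceiver"  # the final destination


def header_blocks_by_role(envelope_xml):
    """Map each targeted role URI to the list of header block tags aimed at it.
    (Hypothetical helper name, used only for this sketch.)"""
    root = ET.fromstring(envelope_xml)
    header = root.find("{%s}Header" % ENV)
    result = {}
    if header is None:
        return result
    for block in header:
        # A header block with no role attribute targets the ultimate receiver.
        role = block.get("{%s}role" % ENV) or ULTIMATE
        result.setdefault(role, []).append(block.tag)
    return result


# Made-up envelope: one block for whichever node is next, one for the end.
EXAMPLE = """<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <a:Audit xmlns:a="urn:example:audit"
             env:role="http://www.w3.org/2003/05/soap-envelope/role/next"/>
    <o:Order xmlns:o="urn:example:order"/>
  </env:Header>
  <env:Body/>
</env:Envelope>"""
```

Running header_blocks_by_role(EXAMPLE) puts the Audit block under the 'next' role and the Order block under the ultimate receiver, so a single message carries instructions for both the 'network' and the 'end'.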

What was really happening is that the whole concept of an 'end point' as a single process on a single machine was disappearing. Intermediaries were getting so bloody smart (at least in theory; in practice, btw, I can't find many customers who use SOAP level intermediaries in non-trivial real world situations) that the intermediaries effectively become part of the 'end point'.

In essence the collection of intermediaries could be thought of as forming a single virtual end point where the end point logic was distributed across multiple nodes. This 'virtual end point' problem wouldn't have mattered if the distributed end points appeared to be a single end point to the outside world but in practice the SOAP folks seemed to want to expose the fact that the 'virtual end points' were 'virtual' and that they did include many intermediaries and so the headaches just got worse (a classic example of this problem is who exactly acknowledges a message in a SOAP level reliable messaging protocol? An intermediary? The ultimate destination?). Hence terms like 'ultimate destination' showed up.

Simplicity Really Is Good

So slowly, without anyone even realizing it, the whole idea of end-to-end had changed. Now end-to-end was about moving a message from the 'ultimate sender' to the 'ultimate destination' and addressing all the intermediaries in between. Hop-by-hop on the other hand had taken on the meaning of sending a message point to point without a bunch of intermediaries and/or addressing information just to the next hop in an end-to-end design (e.g. RFC 2616 usage). But the key here is to note that end-to-end in the IETF sense of smart end points with dumb intermediaries was well and truly dead. Intermediaries were not dumb and fast, they were smart and slow.

While I believe the case for HTTP violating the end-to-end principle was a strong one, the point of the article Nick responded to was that I don't believe violating the end-to-end principle for SOA scenarios is worth the cost. Unlike caching in HTTP, I don't believe there exists an equivalent high value feature set that is worth the 70+ specs of crud that Web Services in particular have dumped on us as they attempt to make intermediaries smarter and smarter. My argument is that in SOA we are better off not having smart intermediaries and instead radically simplifying our SOA protocol stack and moving back to a true end-to-end model. Yes, there can be benefits to smart intermediaries but I don't believe those benefits outweigh the costs in the case of SOA.

But the irony is that I couldn't use the term 'end-to-end' in the way I do in the paragraph above and be understood by the application protocol people I communicate with on a daily basis. The term 'end-to-end' now means what HTTP made it mean. It is all about sending a message from the (using the SOAP term) ultimate sender to the ultimate receiver. Thinking of end-to-end and hop-by-hop in this way doesn't make much sense in the world of transport protocols where intermediaries are truly invisible but since HTTP and SOAP it is how these things are thought of in the application world.

But Prove it!

In Nick's comments he challenged me to come up with printed examples of people using end-to-end and hop-by-hop in the way I use the terms in the article. I doubt I can come up with examples that will satisfy him. Terminology is constantly changing and I had to pick the terms I felt would be best understood by my target audience. Based on my conversations with the people I work with in the standards world as well as with customers my general impression is that I used the terms end-to-end and hop-by-hop in the manner they are generally understood by the community I am a member of.

Conclusion

The distinction between end-to-end in transport and application protocols is a subtle one. While end-to-end in the transport world mostly means the 'end-to-end principle' and the idea of smart edges and dumb, fast intermediaries, in the application world end-to-end involves smart and, by necessity, very visible intermediaries that are full partners in the transmission of application messages. The difference in world views leads to very different architectural outcomes. The irony is that in the article that started this discussion I was actually arguing for SOA (machine-to-machine communication) to return to a true end-to-end design (in the transport protocol sense, i.e. the end-to-end principle) and away from the smart intermediary design heralded by HTTP.

4 Responses to End-To-End Confusion – The Changing Meaning of End-To-End in Transport and Application Protocols

  1. Mark Baker says:

    “So applying this principle to an application protocol like HTTP one wants HTTP intermediaries, mostly caching proxies, to be dumb and fast.”

    No, you don’t. I agree with Nick, you’re confusing things here, as my comment on the last post also observed. The end to end model is about where in the *stack* (vertical) functionality should or shouldn’t be placed, not where on the network (horizontal).

    HTTP intermediaries *are* at the ends of the network because they are recognized as first class application layer entities. Yes, transparent intermediaries are bad from this POV; they break e2e because they are transparent. Transparent means that their functionality is effectively “in the network”, not at the application layer. But blame Cisco for those problems, not HTTP.

    Re-read that Dave Reed email (the whole thing) that Nick linked on his blog;

    http://www.postel.org/pipermail/end2end-interest/2001-June/000967.html

  2. Administrator says:

    Mark, I think you are conflating two situations I’m trying to pull apart. From a ‘transport’ perspective HTTP intermediaries are indeed end points and there is no problem.

    But the question is – can you apply the ideals of the end-to-end principles higher up in the network stack than just the transport?

    I believe the answer is yes. For example, one of the consequences of using the end-to-end principle at the application protocol layer is that messages are designed to be moved from end-to-end. Except in this case the ‘end’ is an application level entity not a transport level entity.

    Viewing the principle from this perspective it quickly becomes clear that HTTP proxies are in fact intermediaries and that HTTP’s design, which explicitly allows messages to address both the HTTP server and the proxies separately, clearly violates the end-to-end principle as applied at the application layer because one addresses both the “network” (in the form of the cache specific commands) and the “end point” (the HTTP server itself).

    But as I say in the article in reference to HTTP’s decision: “In the case of HTTP however the cost was well worth paying.” In other words, the end-to-end principle as adapted to application protocols isn’t necessarily always a good principle to follow.

    However I believe that in the case of SOA it would be better to stick with the end-to-end principle as applied at the application protocol layer.

    I think the complications that SOAP intermediaries and the intermediary based web services specs have introduced are not justified given the paltry benefit they provide and the ridiculous complications they produce. I therefore believe that in the case of SOA specific application protocols one would do better to heed the end-to-end principle and just focus on application level ‘end-to-end’ messages where one designs the SOA protocols to assume they are speaking directly from one application level end point to another.

    I actually agree with most of what Dave Reed says in his mails but I don’t agree with his point that caches just introduce more ‘end’. In reality HTTP caches act as the ‘network’ (at the application layer) that messages must go through and suddenly that network has a lot of intelligence and starts messing with messages, often in ways the client neither expected nor desired. That’s why HTTP had to introduce so much machinery around caching. In essence they were trying to send commands both to the server end point and to the ‘network’ itself. This was an expensive decision to make in terms of protocol complexity but a worthwhile one for HTTP. I don’t however believe that a similar clear win exists for SOA and therefore argue against the use of an explicit intermediary model for SOA protocols.

  3. Mark Baker says:

    “But the question is – can you apply the ideals of the end-to-end principles higher up in the network stack than just the transport?”

    But the end-to-end argument of Reed et al, is only, by definition, a “transport” issue. Their concern is that application layer functionality should not be embedded in the network layer, not in the “network” per se.

    I don’t think you can apply reed:end-to-end to application layer intermediaries, except to say that application layer intermediaries should actually be *application* layer intermediaries and not network intermediaries (i.e. not be transparent).

    I also don’t think you need to stretch the definition of end-to-end to make your case that the SOAP intermediary model is problematic. It clearly is, IMO, as it’s woefully underspecified.

    “but I don’t agree with his point that caches just introduce more ‘end'”

    I hope you can see that they do by his definition of “end”. It’s a different definition than you’re using though, I realize that now. Perhaps you should pick a different term? It would avoid a lot of confusion.

  4. Administrator says:

    Keep in mind that the way I use endpoint in the original article is exactly the way it is used by folks like SOAP and WSDL. So the only way I can use a different term is if everyone else in the SOAP/WSDL universe does too!

    I do agree however that the overlapping usages are confusing but it’s unfortunately too late now to do much about it.

    In any case, I actually do think that the end-to-end principle has very useful things to say at the application layer. Not, mind you, as a statement of truth, but rather as one possible design direction. I would even argue that it should be the default design direction and only after compelling evidence is produced (such as the necessity of the HTTP proxy caching infrastructure) should it be violated.

    I don’t believe such evidence has been produced in the case of SOA. But do note, btw, that in the original article I never argued based on the end-to-end principle itself since the audience I was addressing wouldn’t really know what I was talking about anyway. Instead I argued from first principles.
