Stuff Yaron Finds Interesting

Technology, Politics, Food, Finance, etc.

Multi-Protocol Support – or – Yes, there is more than just HTTP

There will not be one protocol to rule them all.

Setting the Stage

In order to serve as a generic request/response protocol, a protocol must meet a few basic requirements. It must:

  • define how requests and responses are sent
  • define how to identify the "purpose" of a request and provide space for additional parameters explaining the "purpose"
  • define how to gracefully handle requests with an unrecognized purpose and/or parameters
  • enable one to easily return responses with generally understood semantics, meaning that at a minimum one can reliably differentiate success from failure
  • enable one to relatively easily and cheaply send large chunks of arbitrary binary streams
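To make the list above concrete, here is a minimal sketch in Python of a generic request/response envelope that satisfies the five requirements. Everything here, the `Request`/`Response` classes and the `handle` dispatcher, is hypothetical illustration, not any particular protocol's actual wire format:

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    purpose: str                                 # requirement 2: identifies the request's "purpose"
    params: dict = field(default_factory=dict)   # requirement 2: room for additional parameters
    body: bytes = b""                            # requirement 5: arbitrary binary payload

@dataclass
class Response:
    ok: bool                                     # requirement 4: reliably differentiates success from failure
    reason: str = ""
    body: bytes = b""

def handle(request, handlers):
    """Requirement 3: gracefully reject a request whose purpose is unrecognized."""
    handler = handlers.get(request.purpose)
    if handler is None:
        return Response(ok=False, reason="unrecognized purpose")
    return handler(request)
```

Requirement 1 (how requests and responses are actually sent) is the part this sketch leaves out, since it is exactly the part each concrete protocol defines for itself.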

A protocol does not have to natively support all of these features; they can be layered on top. In fact, any protocol hit with sufficient force can be made to do anything. However, the more of these features a protocol natively supports, the more likely it is to be hijacked and treated as a generic request/response protocol.

HTTP has all five of these features. I realize that not everyone likes how HTTP implements these features but that is an issue for a future article.

As soon as people realized that HTTP could be treated as a generic request/response protocol they felt a deep need to unify everything. So if one has a generic request/response protocol then clearly no other protocols should be necessary. Why must we have FTP, SMTP, POP, IMAP, SNMP, LDAP, etc.? Why can't we just have HTTP?

Eventually the enthusiastic grand unifier finds out that all those "legacy" protocols are not just going to go away. People have large investments in those protocols and they aren't going to throw them out. A particularly astute grand unifier might even figure out that there is a reason why various protocols co-exist with HTTP: they provide value in certain areas that more than justifies the expense of supporting multiple protocols. Furthermore, the day will come when we are freed from the various design flaws of HTTP and a new set of acronyms comes to bedevil our lives. Either way, HTTP is never going to be "it", so one had better plan for handling all the other protocols that one will be forced to support.

Two Design Patterns

Upon realizing that one has to inter-operate with multiple protocols, the grand unifier goes to plan B: if you can't just make these protocols go away, then why not build translators between these protocols and HTTP?

It turns out, however, that the term translator is misleading. Generally speaking, translators fall into two different design patterns. In the first pattern, one doesn't actually try to connect Protocol X and HTTP directly. Instead one makes Protocol X and HTTP both manipulate the same underlying "state". In the second pattern one writes a real-time translator between Protocol X and HTTP.

Protocol X Client –> Protocol X Server –> "State" <– HTTP Server <– HTTP Client

Figure 1 – Design Pattern One – Unification at the State API level

Protocol X Client –> Protocol X Server to HTTP Translator –> HTTP Server –> "State"

Figure 2 – Design Pattern Two – Protocol Translator

A classic example of the first design pattern is Microsoft™ Windows 2000™. Windows 2000™ supports access to its file store through SMB (a.k.a. CIFS), FTP and WebDAV. Similar schemes have been proposed for SMTP, IMAP, POP and LDAP. In each case the "unification" consists of making the same "state" available through multiple protocols, one of which is HTTP. The unification work is limited to building a "state" manipulation API that is capable of handling the various protocols. This reduces the complexity of the interoperability problem to its absolute minimum.
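A rough sketch of the first design pattern: a single "state" API with thin protocol front-ends layered on top of it, so the front-ends share state without ever talking to each other. The class and method names here are invented for illustration; real SMB, FTP, and WebDAV servers are, of course, far more involved:

```python
class FileStore:
    """The shared "state" API. Every protocol server calls this; none call each other."""
    def __init__(self):
        self._files = {}

    def put(self, path, data):
        self._files[path] = data

    def get(self, path):
        return self._files[path]

class FtpFrontEnd:
    """Hypothetical FTP-flavored front-end over the shared state."""
    def __init__(self, store):
        self.store = store

    def retr(self, path):            # models FTP's RETR command
        return self.store.get(path)

class HttpFrontEnd:
    """Hypothetical HTTP-flavored front-end over the same state."""
    def __init__(self, store):
        self.store = store

    def do_get(self, path):          # models HTTP GET: (status code, body)
        return 200, self.store.get(path)
```

The interoperability work collapses into designing `FileStore`'s interface well enough to serve every front-end, which is precisely why this pattern reduces the problem to its minimum.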

The second pattern tends to show up in systems that dynamically generate responses. In these systems the "state" being manipulated by the protocols is not represented by databases or file systems. Rather, there exist programs that generate output, and the generated output is typically targeted at a particular protocol, usually HTTP/HTML. So rather than carefully separating out the underlying logic and the protocol code, most systems munge them together.

For example, imagine a business website that uses HTTP/HTML request/responses to accept orders from forms. The business will typically put a lot of its workflow logic in the servlets/cgis/etc. that it uses to accept the HTTP request/responses. When it turns out that the business needs to accept orders through a different protocol (EDI, FTP, etc.), things tend to fall apart. Rather than trying to separate out the workflow logic from the HTTP/HTML logic, they layer the new protocol over their existing HTTP infrastructure. That is, rather than separating out their underlying state logic and their protocol-specific logic, and hence being able to use design pattern one to solve their problem, they mangle them all together.
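The separation that would have let the business use design pattern one can be sketched as follows: keep the workflow logic protocol-neutral, and make each protocol handler a thin adapter over it. All names here are hypothetical:

```python
def place_order(catalog, sku, quantity):
    """Protocol-neutral workflow logic: no HTTP, HTML, or EDI knowledge here."""
    if sku not in catalog:
        raise KeyError(sku)
    return {"sku": sku, "quantity": quantity, "status": "accepted"}

def http_order_handler(form, catalog):
    """Thin HTTP adapter: parse the submitted form, delegate, render a status code."""
    try:
        order = place_order(catalog, form["sku"], int(form["qty"]))
        return 200, order
    except (KeyError, ValueError):
        return 400, None

def edi_order_handler(segment, catalog):
    """Thin adapter for an invented EDI-style segment like "PO1*A1*3",
    delegating to the very same workflow function."""
    sku, qty = segment.split("*")[1:3]
    return place_order(catalog, sku, int(qty))
```

Because `place_order` owns the workflow, adding a third protocol means writing one more thin adapter rather than translating through the HTTP stack.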

There are three basic problems in following the second design pattern:

Fighting the protocols – Protocols are designed to manipulate/query a shared "state" in a well-defined manner. Protocols are not designed to be translated to/from other protocols. Therefore when one follows the second design pattern one is forced to move away from using a protocol in the manner for which it was designed.

Expanding the problem – When providing multiple protocols over a single "state", interoperability is implemented at the "state" API level. Therefore the only problem one has to directly deal with is how to create a "state" API that is flexible enough to meet the needs of the various protocols it will serve. When translating between protocols one also has to determine what part of a protocol's "additional" semantics, for example caching, content negotiation, security, connection management, control flow, etc., one needs to move from one protocol to another. In other words, one has to deal with a large problem set.

Bad performance – Protocol stacks tend to be big and expensive to run because they are designed to run over unreliable networks while talking to badly written and only loosely protocol-compliant clients. When one translates between protocols one has to pay the protocol stack price twice.

Therefore, whenever possible, one should use the first design pattern. If you cannot do this then you're stuck; there really isn't any magic advice. At this point the best lesson is to accept that this will happen again: there will always be multiple protocols and you will always have to support them, so in the next project plan for this to occur.

Into the Future

Moving forward we actually see HTTP being de-emphasized in favor of new "transport independent" solutions. Of course, there is no such thing as a transport-independent solution, but that is a subject I will touch upon in another article. In the meantime some solutions are doing a good job (SOAP) and some a mediocre job (ebXML) of factoring HTTP into core services, such as caching and authentication, that are independent of the actual command content. Efforts like BXXP are trying (not terribly successfully, in my opinion – another subject for another article) to create properly factored application protocol frameworks that separate out commands from ancillary application protocol services such as reliable delivery, request/response handling, etc. This new world order will create its own problems as we discover that, yes Virginia, there will be multiple "protocol independent" protocols and some won't even be in XML. Of course, changing/replacing XML would be a good thing in itself but, yes – you guessed it, that's a subject for another article.

In Summary

I hope the reader takes away the message that if they are doing anything but the most trivial logic on their web site they are going to end up having to support a variety of protocols and formats. By accepting this from the start and planning for it, they will save themselves a lot of grief. Additionally, I hope the reader takes away the message that the best way to support multiple protocols/formats is not to create a protocol stack with HTTP at the bottom. Rather, one should separate out one's "state" logic from one's protocol logic and build multiple protocol servers on top of the "state" APIs.
