So what's the difference between RPCs and Protocols and why does it matter?
(Note: Jim Whitehead wrote a great letter after reading this post. When I get some time I'm going to update this article to reflect his comments.)
Programming languages are a reflection of human thought. Humans tend to segregate ideas into well-defined pieces, each piece sufficiently small to be wholly comprehensible as a single unit.
In programming languages these well-defined pieces are called functions/methods/interfaces/objects/etc. For simplicity's sake we will just call them functions since, eventually, everything in most programming languages boils down to a function call.
With the advent of networking it seemed natural to allow for the leveraging of these functions by enabling function calls to be made across the network. The act of making a function call on one machine but having the call executed on another machine is referred to as a Remote Procedure Call (RPC).
RPCs allow for the rapid development of network protocols because they enable function-based programming, something humans have repeatedly proven themselves pretty good at.
When RPCs were largely limited to local area networks with low latency things worked quite well. Because RPC calls tend to represent small chunks of functionality the performance limitation wasn't the network as much as how fast the machine could process an incoming request. Optimized RPC implementations could routinely handle thousands of requests per second so everything worked just fine.
Problems started to occur when attempting to run RPCs over high latency networks. Here the limiting factor is the network, specifically latency, and this is where RPC's drawbacks began to outweigh its benefits.
Because RPCs are designed like functional programs they tend to follow the normal human way of thinking and break each problem into small, well-defined chunks. Each chunk is assigned its own function call, so performing any significant amount of work requires making numerous function calls. Each of these function calls turns into a network round trip. On a high latency network such as the Internet these round trips are quite expensive. The result is that RPCs don't tend to work very well across the Internet.
This is where protocols came in. Protocols are tuned for latency rather than human comprehensibility and therefore are able to achieve significantly higher performance across high latency links by aggregating many functions into a single round trip.
However the process of designing well-aggregated messages is contrary to normal human thought processes and so is difficult for programmers to deal with. This is why most APIs that enable a program to generate protocol messages do not directly mirror the actual protocol message format.
For example, most HTTP APIs have different function calls to do things like set cookies, set headers, set the body, etc. In reality each of these function calls is helping to build up a single HTTP request. But if there were only one function that required all these parameters to be set in one go most programmers would get overwhelmed. By breaking the problem up into smaller parts (headers, cookies, etc.) the problem is made more tractable for programmers.
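As a sketch of this pattern, here is a minimal, invented request builder (not any real library's API) whose small, human-sized calls all feed into one HTTP message:

```python
# A minimal sketch (hypothetical, not a real library) of how an HTTP
# API's many small calls all contribute to a single request message.
class HttpRequestBuilder:
    def __init__(self, method, path):
        self.method, self.path = method, path
        self.headers = {}
        self.body = b""

    def set_header(self, name, value):   # one small call per header...
        self.headers[name] = value

    def set_cookie(self, cookie):        # ...cookies are just another header
        self.headers["Cookie"] = cookie

    def set_body(self, body):
        self.body = body

    def build(self):
        # All of the small calls above collapse into ONE message that
        # crosses the network in a single round trip.
        lines = [f"{self.method} {self.path} HTTP/1.1"]
        lines += [f"{name}: {value}" for name, value in self.headers.items()]
        return ("\r\n".join(lines) + "\r\n\r\n").encode() + self.body


req = HttpRequestBuilder("GET", "/index.html")
req.set_header("Host", "example.com")
req.set_cookie("session=abc123")
print(req.build().decode())
```

Each setter is easy to reason about in isolation, yet `build` produces the single message that actually travels over the wire.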
Hence RPCs present a trade-off between performance and rapid development: one can either quickly develop a low performance RPC or slowly develop a high performance protocol.
An RPC/Protocol Example
A typical example of RPC latency performance problems is file system RPCs like Microsoft's SMB.
Imagine we are writing a program that needs to open a file, read the first 100 bytes, skip 50 bytes and then read another 100 bytes. In most programming environments this requires 5 function calls, each of which turns into a network round trip.
1) A call to open the file.
2) A call to set the stream pointer to the start of the file.
3) A call to read in the first 100 bytes
4) A call to move the stream pointer 50 bytes
5) A call to read in another 100 bytes
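In code, the five calls above might look like the following sketch. The `remote_open`/`remote_seek`/`remote_read` functions are hypothetical stand-ins for RPC stubs, not SMB's actual interface:

```python
# Hypothetical RPC-style file access: each local-looking call below
# costs one network round trip in a protocol such as SMB.
def read_two_chunks(remote_open, remote_seek, remote_read):
    handle = remote_open("\\\\server\\share\\file.dat")  # 1) open      -> round trip
    remote_seek(handle, 0)                               # 2) seek to 0 -> round trip
    first = remote_read(handle, 100)                     # 3) read first 100 bytes
    remote_seek(handle, 150)                             # 4) skip 50 bytes
    second = remote_read(handle, 100)                    # 5) read another 100 bytes
    return first, second
```

Nothing in the code hints that each innocuous-looking line is a full network round trip; that is exactly the comprehensibility RPC buys, and exactly where the latency hides.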
An optimized RPC system can usually either eliminate or pipeline requests 1 and 2, so we can treat this scenario as requiring only 3 round trips. On the Internet each round trip can range from a couple hundred milliseconds to a few seconds, so a user, in the average case, will probably have to wait somewhere between one and ten seconds for this operation to execute.
Now let's look at how a protocol would handle the same scenario. In this case we will use HTTP.
HTTP doesn't have the concept of opening a file or stream pointers so calls 1, 2 and 4 aren't relevant. This leaves a naïve HTTP implementation with two calls:
3) An HTTP GET request with a Range header to get the first 100 bytes (bytes 0 through 99)
5) An HTTP GET request with a Range header to get bytes 150 through 249
In reality most HTTP clients would just pull down the entire file in a single round trip and make that file available in a buffer that the program could then read out of. Therefore the total number of round trips is either one or two.
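The HTTP version can be sketched with Python's standard library. The URL is a placeholder and the server is assumed to honor the Range header:

```python
# Two reads as HTTP range requests -- two round trips instead of five.
import urllib.request

def range_header(start, length):
    # HTTP byte ranges are inclusive, so the first 100 bytes are bytes=0-99.
    return {"Range": f"bytes={start}-{start + length - 1}"}

def fetch_range(url, start, length):
    # One GET per range; a Range-aware server replies with just those bytes.
    req = urllib.request.Request(url, headers=range_header(start, length))
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Placeholder URL -- uncomment against a real server that supports Range:
# first  = fetch_range("http://example.com/file.dat", 0, 100)
# second = fetch_range("http://example.com/file.dat", 150, 100)
```

Note that the open/seek bookkeeping has disappeared entirely: the request itself describes the whole unit of work.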
Moore's law for Networks?
It has been observed that the rate of performance increase for networking technologies actually exceeds the rate set by Moore's law for computing performance. [Ed. Note: I can't find the performance article where I saw the performance curve listed] Therefore it would seem that if we just wait a little while the latency problems that cause RPCs to have bad performance will go away. This is an ideal situation because protocols just add an extra layer of complexity and one more place for things to go wrong. In an ideal world we would just have RPCs so that what programmers wrote directly turned into network communications.
However, assuming the speed of light in a vacuum remains constant, no increase in network bandwidth, router performance, encoding performance or any other measure of network performance will reduce latency to the point where it is no longer a major concern. The reason is that the Earth is large enough, and the speed of light slow enough, that latency will always be a factor. An example may help to illustrate the issue:
Diameter of the Earth at the Equator = 12,756,320 m
Speed of light in a perfect Vacuum = 299,792,458 m/sec
Fastest possible time for a bit to travel from one point on the equator to its opposite point = 12,756,320 m / 299,792,458 m/s ≈ 0.04 seconds.
Note that the previous calculation assumes a hole drilled through the middle of the Earth and filled with a perfect vacuum. In reality the actual transmission time will be much worse because the message will have to either follow the curvature of the Earth or be bounced off satellites, and along the way it will pass through multiple amplifiers and routers which, even when optically based, will still slow it down. The most interesting figure is the total round trip time, which is twice the one-way travel time, or roughly 0.08 seconds. If one were to take a program and insert 0.08 seconds of delay between each function call the resulting program would not be very popular.
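The arithmetic above is easy to check:

```python
# Back-of-the-envelope best-case latency from the figures above.
EARTH_DIAMETER_M = 12_756_320
SPEED_OF_LIGHT_M_S = 299_792_458

one_way = EARTH_DIAMETER_M / SPEED_OF_LIGHT_M_S  # straight-through-the-Earth best case
round_trip = 2 * one_way

print(f"one way:    {one_way:.3f} s")    # ~0.043 s
print(f"round trip: {round_trip:.3f} s") # ~0.085 s
```

(The text rounds these to 0.04 s and 0.08 s; either way, no hardware improvement can push the round trip below this bound.)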
Of course the example is simplistic since not every function call actually goes across the network but one gets the general idea.
Optimizing RPCs to handle latency
As alluded to previously, some RPC systems have tried to improve their network performance by examining function calls and determining which calls can be ignored (since they only represent local activity), which can be pipelined together (so they only take one round trip), etc. These activities do help to improve performance, but eventually one reaches the point where one has little choice but to introduce function calls that do more. That is, function calls that take more parameters, return richer results, etc. The more activity a single function call represents, the fewer function calls are needed, the fewer the round trips, and the better the performance.
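One way to picture such a "bigger" call: instead of the open/seek/read sequence from the file example, a single aggregated request describes all the work up front. The message format below is invented purely for illustration:

```python
# A hypothetical aggregated call: the caller plans every read in
# advance so the whole job fits in one request message (one round trip).
def batch_reads(path, ranges):
    # `ranges` is a list of (offset, length) pairs; the server would
    # open the file once and return all the requested byte ranges.
    return {"op": "read_ranges", "path": path, "ranges": list(ranges)}


msg = batch_reads("file.dat", [(0, 100), (150, 100)])
# msg now carries, in one message, what previously took five calls.
```

The cost is exactly the one described above: the caller must think about the whole batch at once rather than one small, comfortable step at a time.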
However creating bigger, more complex function calls undermines RPC's core benefit, which is that it can take structures that humans deal with easily and turn them automatically into a network protocol.
As a side note, this is the issue that makes the RPC vs. Protocol arguments so heated. There is no reason why an RPC cannot be as good a performer as a protocol. In fact, for reasons I'm not going to go into here, a latency optimized RPC will usually have better performance than most modern protocols. However, as discussed above, the types of things one has to do to reduce RPC latency also rob RPC of its primary benefit – mirroring the way humans think.
Where to from here
When a programmer needs to quickly throw together their own custom, network-based program, RPCs are probably their only realistic choice. So long as the program is being run in a local environment with low latency this shouldn't be a problem. Quite a bit of work has gone into making RPCs better performers, but in the end, so long as we are moving around small bits of functionality, the round trip issue won't go away.
Designing high performance protocols is hard and most likely beyond the ability of most programmers. It is an issue that will have to be left to the experts for the foreseeable future. The issue is then – is there a way to make it easier for normal programmers to use these high performance protocols?
To date the standard way of dealing with this issue is that the protocol experts get together with a group of developers and develop a library that normal programmers can use to access the protocol. The results have often been rather disappointing. It turns out to be very difficult to create API libraries that are both easy to use and result in good network performance. What's worse is that the rate of introduction of new protocols is increasing [Ed. Note: I haven't actually proven this, it's just my own observation] but the number of qualified protocol experts isn't. This means that more people with less training are going to be creating protocols that must have good performance.
This situation would seem to call for two things: a design methodology that increases the probability that a protocol will be high performance, and tools that automatically generate APIs from such protocols so that the resulting APIs are both easy to understand and tend to encourage the programmer to write code that performs well.
These challenges are certainly not easy and do not match the way people develop protocols today but they do seem worth working on. We can see some steps in this direction through efforts like WSDL but there is obviously still a long way to go.