For a long time there has been the distant promise that someday we would all just use IP multicasting to distribute content across the Internet. The idea that one could send a packet to one address and have it magically appear at multiple destinations was a compelling one. However, IP Multicast has never taken off outside of Intranets. I believe the fundamental reason for IP Multicast's failure to reach its promised potential is that it does not scale very well. Specifically, each router on the distribution path of an IP Multicast must allocate memory to remember that multicast for the length of the multicast session. This means that as the number of multicast sessions crossing a router grows, so does the amount of memory the router has to allocate. The memory required grows linearly with the number of sessions, but the number of sessions itself grows exponentially.
What You Need to Know to Read This Paper
This paper assumes the reader has a basic level of familiarity with IP multicast. Specifically, the reader should understand that an IP multicast is an IP message with an address from the reserved 'multicast' range, and that a message sent to one of these addresses is supposed to be delivered to every destination that has expressed interest in that address. These destinations could, in theory, be anywhere on the Internet. So, for example, if the President of the United States were giving a speech, then in theory the President could send the speech out to a single multicast address and everyone in the world interested in the speech would receive it.
This may superficially sound like the same challenge as non-multicast point-to-point communication, but the fundamental difference is that a point-to-point communication clearly specifies its one and only one destination address. This means that routers can be wrong. They can make mistakes in routing a packet. So long as some subset of the routers that see the packet have a good idea of how to route it, the packet will eventually reach its destination. The ability to make mistakes lets routers use a whole range of heuristics to trade off between having accurate routing information for addresses and being able to push packets through quickly.
These heuristics are unavailable for multicast addresses. The reason is that no one actually knows every destination that a multicast address is to reach. Because the membership of a multicast group is potentially very large and constantly changing, multicast packet routing strategies depend on dynamic discovery of group membership.
Imagine we have a network in which M2 connects M1, M3, and M4.
M1 wants to send out a multicast to M3 but not M4. For this to work, M2 must remember M1's multicast address and record the fact that it should forward the multicast packet to M3 but not M4. If M4 then creates a multicast that is to go to M1 but not M3, then M2 has to set aside another memory location recording the fact that M4's multicast messages go to M1 but not M3. Each time a new multicast address is introduced, all the routers on the distribution path have to set aside memory to record it for as long as the multicast lasts. Two multicast sessions means two chunks of memory.
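The bookkeeping described above can be sketched as a toy forwarding table. This is only an illustration, not how any real router stores multicast state: the class name, the method names, and the group addresses are all made up for the example.

```python
from collections import defaultdict


class MulticastRouter:
    """Toy router holding one forwarding entry per active multicast group."""

    def __init__(self, name):
        self.name = name
        # group address -> set of neighbours that asked to receive the group
        self.forwarding = defaultdict(set)

    def join(self, group, neighbour):
        """Record that `neighbour` wants packets sent to `group`."""
        self.forwarding[group].add(neighbour)

    def end_session(self, group):
        """Free the entry once the multicast session is over."""
        self.forwarding.pop(group, None)

    def forward(self, group, arrived_from):
        """Return the neighbours a packet for `group` is copied to."""
        return sorted(self.forwarding.get(group, set()) - {arrived_from})

    def state_entries(self):
        """Number of groups this router currently has to remember."""
        return len(self.forwarding)


# The two sessions from the text (hypothetical group addresses).
m2 = MulticastRouter("M2")
m2.join("239.0.0.1", "M3")  # M1's multicast goes to M3 but not M4
m2.join("239.0.0.2", "M1")  # M4's multicast goes to M1 but not M3

# A router crossed by many sessions needs one entry per session.
busy = MulticastRouter("busy")
for i in range(1000):
    busy.join(f"239.1.{i // 256}.{i % 256}", "M3")
```

Here `m2.state_entries()` is 2 and `busy.state_entries()` is 1000: the state held is exactly proportional to the number of sessions crossing the router, and none of it can be dropped until `end_session` is called.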
Let's say that M2 crashes and comes back up having lost its multicast state. In that case, when M1 sends out a multicast packet, how will M2 know where to send it? Either M2 can send it nowhere, saying "No one has told me they are interested in this," or M2 can send it everywhere, asking "Is anyone interested in this?" In the first case M3 and M4 will miss multicast messages, and in the second case M2 will end up flooding the entire network. In either case, the situation is a bad one. The point is that point-to-point routers can use various heuristics and other tricks to work out where a packet should go, and can even be wrong without doing too much damage. A router that forgets its multicast routings, by contrast, will either cause data to mysteriously disappear (triggering expensive rediscovery algorithms) or will have to flood the network. In other words, multicast routers can't make mistakes, so they can't forget or play the other tricks that point-to-point routers can.
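The two choices open to a router that has lost its state can be sketched as follows. The function name, the `policy` values, and the group address are hypothetical, chosen only to mirror the "send it nowhere" and "send it everywhere" options in the text.

```python
def forward_after_restart(table, group, arrived_from, neighbours, policy):
    """Decide where a multicast packet goes, given possibly missing state.

    table      -- dict mapping group address -> set of interested neighbours
    neighbours -- every neighbour directly attached to this router
    policy     -- what to do when the group is unknown: "drop" or "flood"
    """
    entry = table.get(group)
    if entry is not None:
        # Normal case: the router still remembers who asked for this group.
        return sorted(entry - {arrived_from})
    if policy == "drop":
        # "No one has told me they are interested in this."
        return []
    if policy == "flood":
        # "Is anyone interested in this?" Copy to every other neighbour.
        return sorted(set(neighbours) - {arrived_from})
    raise ValueError(f"unknown policy: {policy!r}")


neighbours = ["M1", "M3", "M4"]
before_crash = {"239.0.0.1": {"M3"}}  # M1's session, delivered to M3 only
after_crash = {}                      # all multicast state forgotten

normal = forward_after_restart(before_crash, "239.0.0.1", "M1", neighbours, "drop")
dropped = forward_after_restart(after_crash, "239.0.0.1", "M1", neighbours, "drop")
flooded = forward_after_restart(after_crash, "239.0.0.1", "M1", neighbours, "flood")
```

With state intact the packet goes to M3 alone. After the crash, `dropped` is empty, so M3 silently stops receiving, while `flooded` contains both M3 and M4, so M4 gets traffic it never asked for.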
One way to fix this problem would be to put the address of every destination on each multicast packet, but obviously this doesn't scale terribly well, as membership will tend to grow exponentially. There are other strategies that depend on grouping destinations together and just putting these group identifiers on the packets, but these solutions have their own instabilities: one has to constantly replace the group destinations with more specific destinations, and if someone screws up an address or loses a group there is no mechanism to recover.
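A rough back-of-the-envelope sketch shows why listing every destination on the packet breaks down. The packet layout here is purely hypothetical (it is not a real IP option or header format), and the figures of a 1500-byte packet, a 20-byte fixed header, and 4 bytes per address are assumptions picked for illustration.

```python
def payload_room(destinations, mtu=1500, base_header=20, addr_bytes=4):
    """Bytes left for data once every destination address is listed.

    Assumes a made-up packet format: a fixed header followed by one
    address per destination, all within a single packet of size `mtu`.
    """
    return max(0, mtu - base_header - destinations * addr_bytes)


room_10 = payload_room(10)      # small group
room_100 = payload_room(100)    # medium group
room_370 = payload_room(370)    # address list fills the whole packet
room_1000 = payload_room(1000)  # list no longer fits at all
```

Under these assumptions a ten-member group leaves 1440 bytes for data, a hundred-member group leaves 1080, and at 370 members there is no room left for any payload.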
MBONE is an IP multicast based effort that has been around for, at this point, probably more than a decade. The reason MBONE works is that all the members carefully configure their routers for optimal paths and limit the number of transmissions. MBONE receives a lot of care and feeding. If we are willing to provide an equivalent level of care and feeding for each and every multicast group, then yes, one could use multicast across the open Internet, but at that level of care and feeding many of multicasting's compelling attributes would be lost.
Reliable multicasting is an effort to deal with the instabilities mentioned above. The basic feature of reliable multicasting is some sort of confirmation that a packet has been received. Some strategies send out messages to confirm delivery, and others send out messages when there is a suspicion or knowledge that a packet hasn't been delivered, but in either case a lot of communication is happening. There is actually a third strategy based on sending extra data to allow for recovery, but even these strategies usually end up needing some sort of ack/nack backup. These strategies do improve the situation, especially when they are smart about recovering data from local members of the group rather than going back to some root node, but I question their utility over application layer based efforts like the HRME approach I describe below. If they try to build all the error correcting functionality into routers, then routers will end up looking more like application servers than super fast (and super dumb) members of the Internet infrastructure. I suspect they will eventually find themselves evolving into an HRME-like strategy.
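To give a feel for "a lot of communication," here is a small counting sketch of the first two strategies. It is a deliberately crude model, not any particular reliable multicast protocol: it assumes every receiver reports directly to the sender, with no suppression or local recovery, and the function and parameter names are invented for the example.

```python
def feedback_messages(receivers, packets_sent, lost, scheme):
    """Count feedback messages arriving at the sender.

    lost   -- set of (receiver_id, sequence_number) pairs never delivered
    scheme -- "ack": each receiver confirms every packet it received
              "nack": each receiver reports only the packets it missed
    """
    if scheme == "ack":
        return receivers * packets_sent - len(lost)
    if scheme == "nack":
        return len(lost)
    raise ValueError(f"unknown scheme: {scheme!r}")


receivers, packets_sent = 1000, 100

# One packet (sequence number 7) is lost close to the sender,
# so every receiver misses it.
upstream_loss = {(r, 7) for r in range(receivers)}

acks = feedback_messages(receivers, packets_sent, upstream_loss, "ack")
nacks = feedback_messages(receivers, packets_sent, upstream_loss, "nack")
```

In this model confirming delivery costs 99,000 messages for 100 packets, and even the report-only-losses scheme produces 1,000 simultaneous messages for a single packet dropped near the sender.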
So, to summarize, IP multicasting across the open Internet requires router memory that grows linearly with an exponentially growing number of multicasts. This situation is quite tractable within an Intranet or other limited area, but on the open Internet it just won't work.
There is, however, an alternative strategy that provides many of the benefits of IP multicasting without the downsides. This is what I call the Host Routing Multicast Engine (HRME). The basic idea is to create a spanning tree above the IP level. This leaves IP routers to do what they do best: route packets quickly. The knowledge of the multicasting group's membership is recorded and maintained at the application level using standard spanning tree strategies, so there is no centralized global state. In essence, HRME just takes reliable multicast strategies and runs them on the end point servers instead of the routers. This is inherently less efficient than running them on routers, but on the other hand it leaves routers to do super fast routing and leaves the application intelligence on the edges of the network. This seems an appropriate trade off.
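The idea of a spanning tree above the IP level can be sketched with a handful of hosts relaying a message to their children. This is only an illustration of application-level relaying under my own assumptions; the class, its methods, and the host names are invented here, and the sketch omits tree construction, membership changes, and recovery.

```python
class OverlayHost:
    """End host that relays messages to its children in an overlay tree."""

    def __init__(self, name):
        self.name = name
        self.children = []  # the only membership state this host keeps
        self.received = []

    def add_child(self, child):
        self.children.append(child)

    def deliver(self, message, send_log):
        """Accept a message, then relay it to each child.

        Each entry appended to `send_log` stands in for one ordinary
        point-to-point (unicast) send between two hosts.
        """
        self.received.append(message)
        for child in self.children:
            send_log.append((self.name, child.name))
            child.deliver(message, send_log)


# Hypothetical five-host tree: root -> a, b and a -> c, d
root, a, b, c, d = (OverlayHost(n) for n in ["root", "a", "b", "c", "d"])
root.add_child(a)
root.add_child(b)
a.add_child(c)
a.add_child(d)

send_log = []
root.deliver("speech", send_log)
```

After the call every host has received the message, `send_log` holds four unicast sends (one per tree edge), and each host stores only its own children. The routers underneath see nothing but normal point-to-point packets and keep no per-group state.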