APP and Dare, the sitting duck
So poor Dare made the apparently unforgivable mistake of questioning anything about APP. First he suggested that maybe APP doesn't solve all the world's problems. Then he clarified that GData isn't APP. And then after a particularly appalling article by Mr. Bray that was so rude that I refuse to contribute to its popularity by linking to it, Dare finally tried to explain that Microsoft isn't trying to destroy APP. I'm going to ignore all the heat because I suspect my handful of readers have already read Dare's articles and the various responses. Instead I'll try to explain what's actually going on at Live. I know what's going on because my job for a little over the last year has been to work with Live groups designing our platform strategy. So I know where the bodies are buried and in many cases helped to bury them.
Live's goals in the protocol space
Most of the services in Live land follow a very similar design pattern, what I originally called S3C which stood for Structured data, with some kind of Schema (in the general sense, I don't mean XML Schema), with some kind of Search and usually manipulated with operations that look rather CRUD like. So it seemed fairly natural to figure out how to unify access to those services with a single protocol. The first place we went was APP. This was a business decision. Live is in the services business, not the protocol or development tool business. I am not paid to give a damn what browser someone talks to Live with, what language our partners/customers develop their software in, what operating system they run, etc. The only thing I am paid to care about is that we get as many people as possible writing as much software as possible that interacts with Live services.
So for us the whole protocol issue is just a barrier to entry. We don't make any money from that barrier. And no, we don't care about lock in. At least the people who give me orders (hey George!) have a good understanding that the days of lock in are long over. The future clearly belongs to connecting services together. That is, nobody is going to have all of their data, services, etc. at Live. It won't happen. We could be the absolute best at everything we do and we still won't own all of a user's data and services. So for us to succeed we have to convince users to keep some of their data/services with us and then make it brain dead easy to connect the data/services they keep with us to all the other data/services they keep in lots of other places.
In other words, it's all about interoperability, and the easier it is to interoperate the more successful we will be. So in our dream world there would exist a protocol that would meet the S3C pattern. A popular protocol. A widely supported protocol. A protocol we could just adopt and get on to the business that can make us money: building cool services.
So with this in mind we first went to APP. It's the hottest thing around. Yahoo, Google, etc. everyone loves it. And as Dare pointed out in his last article Microsoft has adopted it and will continue to adopt it where it makes sense. There was only one problem – we couldn't make APP work in any sane way for our scenarios. In fact, after looking around for a bit, we couldn't find any protocol that really did what we needed.
Because my boss hated the name S3C we renamed the spec Web3S and that's the name we published it under. The very first section of the spec explains our requirements. I also published a FAQ that explains the design rationale for Web3S. And sure enough, the very first question, 2.1, explains why we didn't use ATOM.
Below I try to respond to some of the points made about Dare's article (at least the polite ones).
The general conclusion of the anti-Darians is to use 'linking', the idea being that you put a link in the ATOM feed to the actual object. This isn't a bad idea if the goal is to publish notices about data. E.g. if I wanted to have a feed that published information about changes to my address book then having a link to the actual address book data in the APP entries is fine and dandy. But if the goal is to directly manipulate the address book's contents then having to first download the feed, pull out the URLs for the entries and then retrieve each and every one of those URLs in separate requests in order to pull together all the address book data is unacceptable from both an implementation simplicity and a performance perspective. We need a way whereby someone can get all the content in the address book at once. Also, each of our contacts, for example, is actually quite a large tree. So the problem recurses. We need a way to get all the data in one contact at a go without necessarily having to pull down the entire address book. At the next level we need a way to get all the phone numbers for a single contact without having to download the entire contact, and so on. What this all boils down to is that we need a protocol that natively understands and can interact with the hierarchy of our data. That isn't what APP does. That doesn't mean APP is wrong or bad. It just means that APP isn't optimized to provide access at any arbitrary point in a tree of data.
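To make the hierarchy point concrete, here is a minimal sketch in Python. It is not Web3S or APP; it is an in-memory stand-in where a nested dictionary plays the address book, the slash-separated paths and field names are invented for illustration, and a "request" is just a counter. It contrasts fetching each contact through a list of links with reading a subtree at whatever depth the caller wants.

```python
# Hypothetical in-memory stand-in for an address book; all names are invented.
ADDRESS_BOOK = {
    "contacts": {
        "c1": {"first": "Ada", "last": "Lovelace",
               "phones": {"home": "555-0100", "work": "555-0101"}},
        "c2": {"first": "Alan", "last": "Turing",
               "phones": {"home": "555-0200"}},
    }
}


def get_subtree(tree, path):
    """Return the subtree at a slash-separated path, e.g. 'contacts/c1/phones'."""
    node = tree
    for segment in filter(None, path.split("/")):
        node = node[segment]
    return node


def fetch_via_links(tree):
    """Link-style access: one request for the list of links, then one per contact."""
    requests = 1  # the request that fetches the list of links
    contacts = {}
    for contact_id in get_subtree(tree, "contacts"):
        contacts[contact_id] = get_subtree(tree, f"contacts/{contact_id}")
        requests += 1  # one extra request per linked contact
    return contacts, requests


# Hierarchy-aware access: a single read at whatever level is needed.
whole_book = get_subtree(ADDRESS_BOOK, "contacts")
one_contact = get_subtree(ADDRESS_BOOK, "contacts/c1")
just_phones = get_subtree(ADDRESS_BOOK, "contacts/c1/phones")

# Link-style access returns the same data but costs a request per contact.
linked_book, request_count = fetch_via_links(ADDRESS_BOOK)
```

With two contacts the link-style fetch already takes three simulated requests, and the count grows with the size of the address book, while each hierarchy-aware read stays a single call.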
Use optimistic concurrency
Replacement based semantics are a problem because of versioning issues. We are constantly changing our schemas in backwards compatible ways so we need clients to know how to deal with data they don't recognize.
APP's approach to this problem is to have the client download all the content, change the stuff they understand and then upload all the content including stuff they don't understand. I believe that one of the most important innovations in HTTP was explicitly distinguishing between safe and unsafe methods. GET is safe: one can execute GETs against any server for any reason and not be held responsible for any side effects. PUT, on the other hand, is unsafe. The consequences of a PUT are the caller's direct, personal responsibility. But that responsibility goes out the door as soon as one introduces a paradigm where clients are expected to blindly upload content they don't understand. In that case there is no responsibility. I believe this approach goes too far in blurring the distinction between safe and unsafe methods.
On a practical level though the 'download then upload what you don't understand' approach is complicated. To make it work at all one has to use optimistic concurrency. For example, let's say I just want to change the first name of a contact and I want to use last update wins semantics. That is, I don't want to use optimistic concurrency. But when I download the contact I get a first name and a last name. I don't care about the last name. I just want to change the first name. But since I don't have merge semantics I am forced to upload the entire record including both first name and last name. If someone changed the last name on the contact after I downloaded but before I uploaded, I don't want to lose that change since I only want to change the first name. So I am forced to get an etag and then do an if-match, and if the if-match fails then I have to download again and try again with a new etag. Besides creating race conditions, I have to take on a whole bunch of extra complexity when all I wanted in the first place was just to do a 'last update wins' update of the first name. In a merge scenario the previous is trivial. I would just upload the first name.
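The first-name example can be sketched in a few lines of Python. This is an illustration under assumptions, not anyone's real implementation: `ContactStore` is a hypothetical in-memory store, an integer counter stands in for the etag, and the `interfere` callback simulates another client changing the last name between my download and my upload.

```python
class ContactStore:
    """Hypothetical in-memory store; the integer etag changes on every write."""

    def __init__(self, record):
        self.record = dict(record)
        self.etag = 0

    def get(self):
        """Download the whole record together with its current etag."""
        return dict(self.record), self.etag

    def replace(self, record, if_match):
        """Whole-record replacement guarded by an If-Match style check."""
        if if_match != self.etag:
            return False  # precondition failed; the caller must re-fetch
        self.record = dict(record)
        self.etag += 1
        return True

    def merge(self, partial):
        """Merge update: only the supplied fields change, last update wins."""
        self.record.update(partial)
        self.etag += 1


def set_first_name_by_replace(store, first, interfere=None):
    """Download, edit, upload; retry whenever the etag has moved."""
    attempts = 0
    while True:
        attempts += 1
        record, etag = store.get()
        if interfere and attempts == 1:
            interfere()  # simulate another client writing in between
        record["first"] = first
        if store.replace(record, if_match=etag):
            return attempts


# Replacement semantics: a concurrent last-name change forces a second round trip.
store = ContactStore({"first": "Bob", "last": "Smith"})
attempts = set_first_name_by_replace(
    store, "Robert", interfere=lambda: store.merge({"last": "Jones"})
)

# Merge semantics: each client uploads only the field it cares about.
merge_store = ContactStore({"first": "Bob", "last": "Smith"})
merge_store.merge({"last": "Jones"})    # the other client
merge_store.merge({"first": "Robert"})  # me, with no etag and no retry loop
```

Both stores end up with the same record, but the replacement path needed a download, a failed upload, a second download and a second upload to get there.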
A number of folks seem to agree that merge makes sense but they suggested that instead of using PUT we should use PATCH. Currently we use PUT with a specific content type (application/Web3S+xml). If you execute a PUT against a Web3S resource with that specific content-type then we will interpret the content using merge semantics. In other words, by default PUT has replacement semantics unless you use our specific content-type on a Web3S resource. Should we use PATCH? I don't think so but I'm flexible on the topic. If Web3S proves to matter then it will inevitably end up at a standards body and if that standards body says use PATCH not PUT then we'll use PATCH, not PUT.
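As a rough sketch of that dispatch rule, the Python below switches between merge and replacement based on the content type of the PUT. Only the content type string comes from the text above; `handle_put` is a hypothetical function, plain dictionaries stand in for the XML bodies, and the merge is a shallow one for brevity, whereas a real server would have to merge trees.

```python
WEB3S_MERGE_TYPE = "application/Web3S+xml"  # content type named in the text


def handle_put(resource, body, content_type):
    """Return the new resource state for a PUT (dicts stand in for XML bodies)."""
    # Media type names are case-insensitive, so compare in lower case.
    if content_type.lower() == WEB3S_MERGE_TYPE.lower():
        merged = dict(resource)
        merged.update(body)  # merge: fields the client did not send survive
        return merged
    return dict(body)  # default PUT: the body replaces the whole resource


contact = {"first": "Bob", "last": "Smith"}

merged = handle_put(contact, {"first": "Robert"}, "application/Web3S+xml")
replaced = handle_put(contact, {"first": "Robert"}, "application/xml")
```

Here `merged` keeps the last name while `replaced` loses it, which is the whole difference between the two semantics.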
Why not just modify APP?
We considered this option but the changes needed to make APP work for our scenarios were so fundamental that it wasn't clear if the resulting protocol would still be APP. The core of ATOM is the feed/entry model. But that model is what causes us our problems. If we change the data model are we still dealing with the same protocol? I also have to admit that I was deathly afraid of the political implications of Microsoft messing around with APP. I suspect Mr. Bray's comments would be taken as a pleasant walk in the park compared to the kind of pummeling Microsoft would receive if it touched one hair on APP's head.
Inventing our own protocol wasn't an easy choice. It raised the very barrier to entry we are trying to avoid. But we couldn't figure out how to use APP without putting an unacceptable implementation and performance burden on both our customers and ourselves. So we felt we had no other option. But Web3S is a tactical not a strategic choice. If it turns out there is a better option then we'll adopt it.