Mark Nottingham's article on XML Schema and Web Services is well worth reading. Of course I say that because I mostly agree with him. What I don't agree with Mark on is that RDF is the right way to go, but I am beginning to wonder if perhaps SDO is.

I don't think the world is ready for graph-based data formats to be our lingua franca. I see matters as a progression from linear (ASCII/Unicode) to tree (XML) to graph (?). Seeing how much trouble average programmers have coming up with reasonable tree-structured data, I don't believe the industry has the right design practices to make good use of graphs.

Of course, arguments based on the perceived psychology of programmers are inherently weak (who has the money to test them?), so it wouldn't shock me to find out I'm wrong. But at the end of the day I have to make bets, usually with few solid facts, on where I think technology is going, and my current bet is that trees are where it's going to be at for the next few years.

Still, I like to hedge my bets. Ideally I would like a format that can robustly handle trees but can also provide graph capabilities. It is this desire to hedge that attracts me to SDO, besides the fact that I am (still) a BEA employee.

SDO is a data object model that sits on top of other data object models like XML Schema or even RDF. SDO's core model is based on containment, i.e. one thing is contained by another. All objects in an SDO object model must be contained by some other object, so SDO objects always form proper trees.[1] But in addition to containment, the SDO object model also supports linking: objects can point to each other without having a containment relationship. This enables SDO to model graphs as well as trees. SDO is still primarily tree-based, and I suspect most programmers will find it easiest to just use SDO as a tree model, but if they want graph capabilities, they are there.
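To make the containment-versus-linking distinction concrete, here is a minimal Java sketch. It is not the SDO API; the class and method names (ContainmentSketch, Node, contain, link) are made up for illustration. The only thing it assumes is the rule described above: each object has at most one container, while links are plain pointers that carry no ownership.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is NOT the SDO API.
public class ContainmentSketch {

    static final class Node {
        private final String name;
        private Node container;                              // at most one container
        private final List<Node> contained = new ArrayList<>();
        private final List<Node> links = new ArrayList<>();  // non-containment references

        Node(String name) { this.name = name; }

        String name() { return name; }
        Node container() { return container; }
        List<Node> contained() { return Collections.unmodifiableList(contained); }
        List<Node> links() { return Collections.unmodifiableList(links); }

        // Containment: a child has exactly one container, and a node may not
        // contain itself or one of its own ancestors, so the containment
        // relationships always form a tree.
        void contain(Node child) {
            for (Node n = this; n != null; n = n.container) {
                if (n == child) {
                    throw new IllegalArgumentException("containment cycle");
                }
            }
            if (child.container != null) {
                child.container.contained.remove(child);     // detach from old container
            }
            child.container = this;
            contained.add(child);
        }

        // Linking: a plain pointer with no ownership, so cycles are allowed.
        void link(Node target) { links.add(target); }

        // Walk up the containment chain to the top of the tree.
        Node root() {
            Node n = this;
            while (n.container != null) {
                n = n.container;
            }
            return n;
        }
    }

    public static void main(String[] args) {
        Node company = new Node("company");
        Node alice = new Node("alice");
        Node bob = new Node("bob");
        company.contain(alice);
        company.contain(bob);
        alice.link(bob);   // alice points at bob ...
        bob.link(alice);   // ... and bob points back: a cycle, but only among links
        System.out.println(alice.root().name());
        System.out.println(alice.links().get(0).name());
    }
}
```

A programmer who only ever calls contain gets an ordinary tree. The graph shows up only once link is used, which is the hedge I am after.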

Another advantage of SDO is that it can support XML Schema without overloading the programmer with XML Schema's features. SDO has its own data model that is a subset of XML Schema's. In a very real sense SDO is a profile of Schema that pulls out a lot of the useless gunk from XML Schema. To be fair, however, in some ways SDO's model is more powerful than XML Schema's since it can express graph structures and I hope it will have proper support for an open content model, capabilities XML Schema doesn't really provide.

What makes SDO particularly interesting is that it does not have its own schema persistence format. If one wants to persist an SDO data description then one has to use a language like XML Schema or RELAX NG. This gets tricky if one creates an SDO structure that contains a graph or is based on an open content model, since neither maps cleanly onto those languages. But these are not new problems, so SDO is not making the current situation any worse. People already occasionally try to shoehorn graphs into XML Schema (IDREF), as well as open content models (two pass validation, validation by projection, etc.).
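As a rough illustration of the IDREF shoehorn, the Java sketch below writes containment as element nesting and writes each link as a reference to the target's id. Everything in it is hypothetical: the class names (IdRefSketch, Item) and the item, ref, id and idref names are invented for this example, and it assumes the ids are already valid XML names, so nothing is escaped. In a real schema one would presumably declare id as xs:ID and idref as xs:IDREF.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: element and attribute names are made up for illustration.
public class IdRefSketch {

    static final class Item {
        final String id;                                // assumed to be a valid XML name
        final List<Item> children = new ArrayList<>();  // containment
        final List<Item> refs = new ArrayList<>();      // non-containment links

        Item(String id) { this.id = id; }
    }

    // Containment becomes element nesting; each link becomes a <ref> element
    // whose idref attribute names the target's id.
    static String toXml(Item item) {
        StringBuilder sb = new StringBuilder();
        write(item, sb);
        return sb.toString();
    }

    private static void write(Item item, StringBuilder sb) {
        sb.append("<item id=\"").append(item.id).append("\">");
        for (Item target : item.refs) {
            sb.append("<ref idref=\"").append(target.id).append("\"/>");
        }
        for (Item child : item.children) {
            write(child, sb);   // recursion follows containment only, so it terminates
        }
        sb.append("</item>");
    }

    public static void main(String[] args) {
        Item root = new Item("root");
        Item a = new Item("a");
        Item b = new Item("b");
        root.children.add(a);
        root.children.add(b);
        a.refs.add(b);
        b.refs.add(a);          // a cycle among links is fine: links are never followed
        System.out.println(toXml(root));
    }
}
```

The writer only ever recurses through containment, so the output is always a tree even when the links form a cycle. The cost is that the graph edges survive only as id strings that the reader has to resolve afterwards.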

With some smart politicking I think there is a good chance that the Java world could come to a consensus on SDO. Such a consensus would give Java programmers a powerful yet simple model with which to express their data structures, one that will be fully compatible with any system that supports XML Schema. If this happens then I would expect the SDO model, which was adapted to Java rather than based on it, to spread beyond the Java world.

All in all I think SDO is a great way to maintain current investments, make schema accessible to the majority of programmers and hedge our bets on the tree/graph debate.

[1] It is possible to create objects that aren't contained by anything but in most cases one is required to clean up one's graph and make sure there are proper containment relationships before sharing the graph with others.