As part of my job I am thinking deep thoughts about what protocols Windows Live should expose and right now I'm pushing hard for JSON to be a premier protocol data format. I like JSON because it makes it extremely easy to persist hierarchical data structures which account for the bulk of the messages that Windows Live needs to move around. But JSON does have a number of issues that I think need to be addressed, specifically: namespaces, extensibility, schema/type support and relative linking. In this article I make a proposal for how to address namespaces. I will address the other issues in future articles.


The Problem

If two groups both create a name "firstName" and each gives it a different syntax and semantics, how is someone handed a JSON document supposed to know which group's syntax and semantics to apply? In some cases there might be enough context (e.g. the data was retrieved from one of the groups' servers) to disambiguate the situation, but it is increasingly common to build distributed services in which the original source of a piece of information can trivially be lost somewhere down the processing chain. It would therefore be extremely useful for JSON documents to be 'self describing', in the sense that one can look at any name in a JSON document in isolation and have some reasonable hope of determining whether that name carries the syntax and semantics one is expecting.

The Proposed Solution

It is proposed that JSON names be defined as having two parts, a namespace name and a local name. The two are combined as namespace name + "." + local name to form a fully qualified JSON name. Namespace names MAY contain the "." character. Local names MUST NOT contain the "." character. Namespace names MUST consist of the reverse listing of subdomains in a fully qualified DNS name. E.g. org.goland or com.example.bigfatorg.definition.
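As a rough illustration of the naming rule, here is a minimal Python sketch that splits a name at its last "." (which is safe because local names cannot contain a "."). The function names split_qualified_name and join_qualified_name are hypothetical and not part of the proposal, and the sketch does not check that the namespace is a real reverse DNS name.

```python
from typing import Tuple


def split_qualified_name(name: str) -> Tuple[str, str]:
    """Split a JSON name into (namespace name, local name).

    Local names may not contain ".", so the last "." in a fully
    qualified name is always the separator. A name with no "." at all
    is returned with an empty namespace.
    """
    namespace, dot, local = name.rpartition(".")
    if not dot:
        return "", name
    if not namespace or not local:
        raise ValueError(f"malformed qualified name: {name!r}")
    return namespace, local


def join_qualified_name(namespace: str, local: str) -> str:
    """Combine a namespace name and a local name into a qualified name."""
    if "." in local:
        raise ValueError('local names MUST NOT contain the "." character')
    return f"{namespace}.{local}"


print(split_qualified_name("com.example.bigfatorg.definition.firstName"))
print(join_qualified_name("org.goland", "lastName"))
```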

To enable space savings and to increase both the readability and write-ability of JSON a JSON name MAY omit its namespace name along with the "." character that concatenated it to its local name. In this case the namespace of the name is logically set to the namespace of the name's parent object. E.g.

{ "org.goland.schemas.projectFoo.specProposal" :
{ "title": "JSON Extensions",
"author": { "firstName": "Yaron",
"com.example.schemas.middleName":"Y",
"org.goland.schemas.projectFoo.lastName": "Goland"
}
}
}

In the previous example the name firstName, because it lacks a namespace, takes on its parent object's namespace. That parent is author, which also lacks a namespace, so author in turn looks to its parent specProposal, which does have a namespace: org.goland.schemas.projectFoo. middleName introduces a new namespace, "com.example.schemas"; if its value were an object, the names in that object would inherit the com.example.schemas namespace. Because the use of the compression mechanism is optional, the lastName name can be fully qualified even though it shares the same namespace as its parent.
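The inheritance rule can be sketched as a recursive walk that rewrites every name into its fully qualified form. This is only one possible reading of the rule, written in Python. The function name expand_names is hypothetical, and the treatment of arrays (objects inside an array inherit the namespace of the array's name) is an assumption, since the proposal says nothing about arrays.

```python
import json


def expand_names(value, inherited=None):
    """Return a copy of value in which every object name is fully qualified."""
    if isinstance(value, dict):
        expanded = {}
        for name, child in value.items():
            namespace, dot, local = name.rpartition(".")
            if not dot:
                # No "." in the name: inherit the parent's namespace.
                if inherited is None:
                    raise ValueError(f"root name {name!r} is not fully qualified")
                namespace = inherited
            elif not namespace or not local:
                raise ValueError(f"malformed name {name!r}")
            # Children inherit whichever namespace this name ended up with.
            expanded[f"{namespace}.{local}"] = expand_names(child, namespace)
        return expanded
    if isinstance(value, list):
        # Assumption: array members inherit the namespace of the array's name.
        return [expand_names(item, inherited) for item in value]
    return value


document = json.loads("""
{ "org.goland.schemas.projectFoo.specProposal":
  { "title": "JSON Extensions",
    "author": { "firstName": "Yaron",
                "com.example.schemas.middleName": "Y",
                "org.goland.schemas.projectFoo.lastName": "Goland" } } }
""")

expanded = expand_names(document)
print(json.dumps(expanded, indent=2))
```

With this reading, title, author and firstName all end up in org.goland.schemas.projectFoo, while middleName keeps com.example.schemas.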

If the name of the root object in a JSON structure is not fully qualified then the names contained in that JSON structure MUST NOT be treated as being compliant with this specification. Note however that the presence of a fully qualified name is not sufficient to determine that a JSON structure is compliant with this proposal as it is legal to have names that include the "." character in JSON. To be sure a JSON structure is compliant one needs out of band information.
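A short Python sketch of that check follows. It treats every name in the outermost object as a root name, which is an assumption on my part, and the function name root_is_fully_qualified is hypothetical. As the paragraph above says, passing this check is necessary but not sufficient.

```python
def root_is_fully_qualified(document) -> bool:
    """Return True if every name in the outermost object contains a
    namespace part and a local part separated by "."."""
    if not isinstance(document, dict) or not document:
        return False
    for name in document:
        namespace, dot, local = name.rpartition(".")
        if not (dot and namespace and local):
            return False
    return True


# A root name that follows the proposal.
print(root_is_fully_qualified({"org.goland.schemas.projectFoo.specProposal": {}}))
# An unqualified root name: the structure must not be treated as compliant.
print(root_is_fully_qualified({"specProposal": {}}))
# A false positive: a legal JSON name that merely happens to contain ".".
print(root_is_fully_qualified({"version.1": {}}))
```

The last call is the reason out of band information is still needed: the name passes the check even though nobody intended "version" to be a namespace.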

Q&A

Isn't this proposal incompatible with existing JSON systems?

Since this proposal doesn't change JSON's syntax any JSON object generated in compliance with this proposal will be processable by any existing JSON processor. Even the introduction of namespaces is not, in itself, a big deal as JSON currently says nothing about the semantics of names so sprinkling in "."s doesn't change things. What does change things however is the compression mechanism. An existing JSON processor would reasonably see "org.goland.firstName" and "firstName" as being unrelated names. But with this proposal their relationship would be defined by their relative positions in the object structure. This isn't something an existing JSON processor would know how to address. The practical ramification of this is that when the JSON processor translates the JSON structure into a programming language it won't output fully qualified names and so could cause a real mess.
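To make the problem concrete, here is what an unmodified parser does, using Python's standard json module as a stand-in for "an existing JSON processor". The document and the name org.goland.person are made up for illustration.

```python
import json

# Under this proposal both inner names would mean org.goland.firstName,
# because the unqualified one inherits org.goland from its parent.
text = """
{ "org.goland.person":
  { "firstName": "Yaron",
    "org.goland.firstName": "Yaron" } }
"""

person = json.loads(text)["org.goland.person"]

# The parser hands back the names exactly as written, so the program
# sees two unrelated keys and no fully qualified form of "firstName".
plain_keys = sorted(person)
print(plain_keys)
```

Any code that looks up "org.goland.firstName" in the parsed structure will miss values that were written in the compressed form unless an extra expansion step is run first.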

Given the compatibility issue is it appropriate for systems compliant with this proposal to use the application/json MIME type?

The application/json MIME type is defined in RFC 4627. MIME types have traditionally focused on syntax, not semantics, so it's reasonable to argue that application/json is appropriate for use with this proposal, since the proposal doesn't change the JSON syntax. At a minimum, though, it would seem reasonable to extend RFC 4627 to include an optional parameter indicating compliance with this extension. In any case I'm open to ideas.

Why not allow for relative namespace names?

{ "com.example.something":
{ ".foobar.somethingelse": "Isn't this neat?" }
}

One could argue that ".foobar.somethingelse" should, because it starts with a ".", be treated as a relative namespace and therefore its full namespace would be "com.example.foobar". This seemed to me to be too clever by half and so I decided not to do it.

Why not use namespace prefixes à la XML?

To use namespace prefixes we would have to add in an object whose only purpose was to define the prefixes. Then we would have to create a bogus root to contain that object. E.g.

{ "bogusroot":
{ "json.namespacePrefixDefinitions":
{ "G":"http:\/\/goland.org\/schemas\/projectfoo",
"E":"http:\/\/example.com\/schemas" },
"realroot":
{ "G.specProposal" :
{ "title": "JSON Extensions",
"author": { "firstName": "Yaron",
"E.middleName":"Y",
"G.lastName": "Goland"
}
}
}
}
}

I used the "." character instead of the ":" character to separate the prefix from its local name because I think that's more readable in JSON. But in any case note the nastiness involved in using prefixes. The reason this doesn't seem as nasty in XML is because XML has those horrific violators of data consistency – attributes. Since no such animals (thankfully) exist in JSON we have to do violence to the object model in order to enable prefixes. Hence my rejection of prefixes.

Why use DNS when you could have used URLs?

I suppose the weasel answer is that "/" is illegal unescaped in a JSON string so you would end up with:

{ "http:\/\/goland.org\/schemas\/projectfoo\/specProposal" :
{ "title": "JSON Extensions",
"author": { "firstName": "Yaron",
"http:\/\/example.com\/schemas\/middleName":"Y",
"http:\/\/goland.org\/schemas\/projectfoo\/lastName": "Goland"
}
}
}

Which I personally think is just plain ugly. But I see nothing sacred in the JSON format and have given serious thought to proposing an alternative because I always get confused where to put curly brackets versus commas and I'm concerned that there is no way to annotate values in a string. In such an alternative I could make it legal to use "/" characters in names.

But in this case the real reason I avoided URLs is simplicity. I've never bought into the idea that a namespace name should necessarily be resolvable and I find reverse DNS names to be easier to deal with. Just personal taste I suppose. Apparently those years in Java land rubbed off on me.

Do we really need the compression mechanism?

I would dearly love to get rid of the compression mechanism because it complicates the processing model, makes it harder to pull objects out of a JSON structure, etc. But I'm deeply concerned that requiring each and every name to be fully qualified will make JSON both unreadable and unwriteable. E.g.:

{ "org.goland.schemas.projectfoo.specProposal" :
{ "org.goland.schemas.projectfoo.title": "JSON Extensions",
"org.goland.schemas.projectfoo.author": { "org.goland.schemas.projectfoo.firstName": "Yaron",
"com.example.schemas.middleName": "Y",
"org.goland.schemas.projectfoo.lastName": "Goland"
}
}
}

Um…. yuck. And that's without even discussing the byte bloat.
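For a rough sense of the bloat, the sketch below serializes the compressed and the fully qualified forms of the example with Python's json module and compares their UTF-8 sizes. The helper name encoded_size is hypothetical, and the figures only describe this one small document with whitespace stripped.

```python
import json

NS = "org.goland.schemas.projectfoo"

# Compressed form: only the root name and the foreign name are qualified.
compressed = {
    f"{NS}.specProposal": {
        "title": "JSON Extensions",
        "author": {
            "firstName": "Yaron",
            "com.example.schemas.middleName": "Y",
            "lastName": "Goland",
        },
    }
}

# Fully qualified form: every name carries its namespace.
qualified = {
    f"{NS}.specProposal": {
        f"{NS}.title": "JSON Extensions",
        f"{NS}.author": {
            f"{NS}.firstName": "Yaron",
            "com.example.schemas.middleName": "Y",
            f"{NS}.lastName": "Goland",
        },
    }
}


def encoded_size(document) -> int:
    """Size in bytes of the document serialized without extra whitespace."""
    return len(json.dumps(document, separators=(",", ":")).encode("utf-8"))


small, large = encoded_size(compressed), encoded_size(qualified)
print(f"compressed: {small} bytes, fully qualified: {large} bytes")
```

Each inherited name costs the length of the namespace plus one "." when written out in full, so the gap grows with the number of names in the document.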

How can I define a name that has no namespace?

The proposal doesn't allow for that. I think that allowing for a mix of non-namespace qualified names and namespace qualified names just adds a lot of complexity for zero benefit so I require that all names be namespace qualified in order to be compliant with this proposal.