Making HTML5 peer to peer web friendly
Friday August 30th 2013, 1:57 pm
Filed under: SOA/Web/Etc.
HTML5 is built on the assumption of a client/server web. Below I walk through the issues this raises for the peer to peer web. The good news is that HTML5 needs only a handful of changes to become peer to peer friendly. Basically we need a new same origin policy that is based on certs rather than hosts, a way to handle mutual auth requests, standardized support for node.js (or an equivalent) and a few other minor things.
1 How the peer to peer web is different than the client/server web
HTML5 is based on a client/server view of the web. The web itself, i.e. HTML/HTTP, doesn’t have these assumptions built into it. But HTML5 does. Below I explore those assumptions and why they are a problem for the peer to peer web.
Servers are identified by DNS
In HTML5 land, if you run a web server then it has a DNS address. You can try using an IP address, but in practice nobody does since so much depends on DNS redirects, load balancing, etc. In the peer to peer web we don’t require people to pay money to play and we don’t depend on centralized infrastructures (aka one stop security holes; and no, DNSSEC does not fix this, since you still have to trust whoever holds the DNSSEC keys). So peer to peer communications will not depend on DNS.
Secure communications go over HTTPS to servers with CA rooted certs
The core of the HTML5 network security model is HTTPS, with the assumption that servers are identified by DNS and that the server will present an X.509 cert that refers to that DNS name and roots to a certificate authority (CA) that is in the browser’s root store. As already mentioned, the peer to peer web doesn’t rely on DNS names. But we also don’t support requiring people to pay money to a CA to get a rooted cert. Even more importantly, we don’t support the CA model as used in browsers. That model essentially says that any CA (or anyone the CA vouches for) can masquerade as literally any website on the Internet. This is a completely insane ’security’ model. So no CA cert requirements for the peer to peer web.
Clients don’t use mutual authentication with TLS
HTML5 simply doesn’t have the concept that a client might authenticate itself via TLS mutual auth. But the peer to peer web is by its nature symmetric, so if we authenticate one side of the communication with a cert we will authenticate the other side with a cert as well.
Clients aren’t servers
HTML5 doesn’t (currently) countenance the idea that there are peers and that, as peers, they are simultaneously a client and a server. Yes, there are full duplex transports like Web Sockets, but there is no way in HTML5 for someone to set up a Web Socket listener, much less an HTTP listener.
Applications come from a single server
HTML5 has tons of goodies that make it possible to write web applications that run completely locally. But these models assume that the pages in the application are loaded and updated from a single server location (identified via DNS) on the Internet. In a peer to peer infrastructure it often makes sense to get a copy of an application from a local peer rather than from some random server, and at most validate a signature to make sure the copy is right. Reasons for operating this way include efficiency but also security. At some point I’ll need to write an article explaining why distributed distribution is much better than centralized distribution, but that’s another story. Unfortunately there is no standard way in HTML5 to hand off a web application as a ’unit’, dump it in the local file system and run it.
2 Making the Same Origin Policy peer to peer web friendly
Imagine that a web page is loaded and it creates an indexed database instance. Now imagine another web page is loaded, how does the browser decide if the second web page should be able to access the indexed database instance created by the first web page? The answer is the same origin policy. The browser checks the URL of the first and second web pages and if their URLs have the same scheme, host and port then both can access the same indexed database instance. The same origin policy is the core security mechanism for just about everything in HTML5.
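To make the comparison concrete, here is a minimal sketch of the scheme/host/port check in Python. It only illustrates the rule described above; the function names and the default port table are my own assumptions and real browsers handle many more cases.

```python
from urllib.parse import urlsplit

# Assumed default ports for the two schemes used in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple the classic policy compares."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    """Two pages share storage only if all three parts of the origin match."""
    return origin(url_a) == origin(url_b)
```

Note that nothing in the triple says who is actually on the other end of the connection, only where it is.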
This policy blows up for the peer to peer web because there peers are identified by their keys, not by their network location. Peers move around, and their IP addresses change as they go from home to work to the coffee shop and everywhere in between. They don’t have DNS names. They don’t even have consistent access mechanisms. For example, one might send an email to a peer via a mix network. One might send an IM to a peer via an onion network. And one might have a video chat with a peer via a point to point connection. In all cases the peer has the same identity and should have the right to access the same HTML5 resources, but the URIs used to communicate with the peer will all be different.
One way to work around this is to introduce a new URI type that includes a hash of the peer’s key (or cert or whatever) as the host portion. But I am firmly convinced this is actually a really bad idea. The reason is that essentially what we would be doing is creating a URI wrapper for URIs. Imagine we specify a URI of the form:
peertopeer://[type of hash]/[hash of peer’s cert]/[transport mechanism]/[transport mechanism specific arguments]
So using our previous example we might use the following three URIs:
Notice what just happened. What we have here are really three totally different transports with three totally different sets of arguments, but we have shoehorned them into a single URI scheme just to get around the same origin policy. This is the sort of thing that isn’t going to end well. It screws up URL handling infrastructure. It confuses URL syntax. It essentially breaks the URL model by creating one URL to rule them all. This is a bad idea.
A much better idea is to introduce a parallel same origin policy that is explicitly based on a cert hash, and to provide a mechanism for registering new URL types and specifying which same origin policy applies to each of them: the current scheme/host/port policy or the cert based one.
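A rough sketch of what such a registry could look like, in Python. Everything here is hypothetical: the registry, the policy names, the `p2pdirect` and `p2ponion` schemes in the usage note, and the idea that the transport hands the origin check the hash of the cert the peer presented.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit

HOST_POLICY = "scheme-host-port"   # today's same origin policy
CERT_POLICY = "cert-hash"          # the proposed parallel policy
DEFAULT_PORTS = {"http": 80, "https": 443}

# Hypothetical registry: which origin policy applies to which URL scheme.
_policies: Dict[str, str] = {"http": HOST_POLICY, "https": HOST_POLICY}


def register_scheme(scheme: str, policy: str) -> None:
    """Register a new URL type and the origin policy that governs it."""
    if policy not in (HOST_POLICY, CERT_POLICY):
        raise ValueError("unknown origin policy: " + policy)
    _policies[scheme.lower()] = policy


def origin_for(url: str, peer_cert_hash: Optional[str] = None) -> Tuple:
    """Compute the origin of a URL under whichever policy its scheme uses."""
    parts = urlsplit(url)
    policy = _policies.get(parts.scheme)
    if policy is None:
        raise ValueError("unregistered scheme: " + parts.scheme)
    if policy == CERT_POLICY:
        # Assumption: the transport supplies the hash of the peer's cert.
        if not peer_cert_hash:
            raise ValueError("cert based origin needs the peer's cert hash")
        return (CERT_POLICY, peer_cert_hash.lower())
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (HOST_POLICY, parts.scheme, parts.hostname, port)
```

With two made-up schemes registered under `CERT_POLICY`, a point to point URL and an onion URL that reach the same cert end up in the same origin, while the URL schemes themselves stay independent of each other.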
3 Introducing a new HTTPS URL type - httpscert
The security model for the HTTPS URL type is fundamentally based on the global CA model. To fix this for point to point connections we need a new type of HTTPS URL, maybe HTTPSCERT (don’t worry, it’s not something users would ever type in). It would be exactly the same as the existing HTTPS URL except that the scheme would be HTTPSCERT and before the host part would be a hash type and value. E.g. something like httpscert://123456/sha256/10.0.1.1/some/path. DNS names could also be used if desired. But this URL type would be registered as using a hash based origin policy.
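As a sketch of how a resolver might pull such a URL apart, here is some Python that parses the example above. The segment order (hash value, then hash type, then host) is simply taken from that example, and the function and field names are mine; none of this is a settled syntax.

```python
from typing import NamedTuple, Tuple

PREFIX = "httpscert://"


class HttpsCertUrl(NamedTuple):
    hash_value: str
    hash_type: str
    host: str
    path: str


def parse_httpscert(url: str) -> HttpsCertUrl:
    """Split an httpscert URL into hash value, hash type, host and path."""
    if not url.lower().startswith(PREFIX):
        raise ValueError("not an httpscert URL")
    # Assumed layout: httpscert://<hash value>/<hash type>/<host>/<path>
    pieces = url[len(PREFIX):].split("/", 3)
    if len(pieces) < 3 or not all(pieces[:3]):
        raise ValueError("expected hash value, hash type and host")
    hash_value, hash_type, host = pieces[:3]
    path = "/" + pieces[3] if len(pieces) == 4 else "/"
    return HttpsCertUrl(hash_value.lower(), hash_type.lower(), host, path)


def cert_origin(url: str) -> Tuple[str, str]:
    """The hash based origin ignores the host entirely."""
    parsed = parse_httpscert(url)
    return (parsed.hash_type, parsed.hash_value)
```

Two URLs that carry the same hash but different hosts, say 10.0.1.1 and 192.168.0.7, get the same origin, which is exactly the point.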
4 Cert chains and key roll over
For security reasons it makes sense that a peer might create a cert with a particular hash but not want to use it on a daily basis to sign/encrypt things. Instead it would create a subsidiary cert that it would then chain to its main cert. To make this work we would specify that when connecting to a server via TLS using a cert hash the server is required to present a cert that either has the specified hash or chains to a cert with the specified hash.
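The matching rule can be sketched in a few lines of Python. This assumes the chain arrives as a list of DER encoded certs and that the TLS stack has already verified the signatures linking them; the sketch only checks the hash, and the function names are made up.

```python
import hashlib
import hmac
from typing import Iterable


def cert_fingerprint(cert_der: bytes, hash_type: str = "sha256") -> str:
    """Hex digest of a DER encoded cert using the named hash."""
    return hashlib.new(hash_type, cert_der).hexdigest()


def chain_matches_hash(chain_der: Iterable[bytes], hash_type: str, expected_hex: str) -> bool:
    """True if the presented cert, or any cert it chains to, has the hash.

    Assumption: signature validation of the chain itself has already been
    done elsewhere. This function does not verify any signatures.
    """
    expected = expected_hex.lower()
    return any(
        hmac.compare_digest(cert_fingerprint(der, hash_type), expected)
        for der in chain_der
    )
```

So a peer can present a day to day cert first and its main cert second, and a URL that names the main cert’s hash still resolves.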
Another issue is key roll over. Even with chaining, the main cert will eventually need to be rolled over, if only because its key size might no longer be long enough. Personally my thinking is that there should be an extension to the X.509 cert syntax to enable a cert to identify itself as the roll over of another cert, along with a link to a file signed by the previous cert validating the roll over. From an HTML5 perspective this means that the cert based same origin policy needs a mechanism to detect these roll overs and substitute the new cert hash for the old one. It also means that URL resolvers need to detect and handle this situation correctly, e.g. not fail when the hash doesn’t match but rather detect the roll over, confirm it and then switch to the new cert.
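Here is one way the bookkeeping could look, as a Python sketch. The `RolloverStore` class is hypothetical, and checking the signed roll over file is left to a caller supplied `verify` function since no format for that file exists yet.

```python
from typing import Callable, Dict


class RolloverStore:
    """Remembers confirmed roll overs so old cert hashes keep resolving."""

    def __init__(self, verify: Callable[[str, str, bytes], bool]) -> None:
        # verify(old_hash, new_hash, proof) stands in for checking the
        # file signed by the previous cert; its format is an open question.
        self._verify = verify
        self._successor: Dict[str, str] = {}

    def record_rollover(self, old_hash: str, new_hash: str, proof: bytes) -> bool:
        """Accept a roll over only if the old cert vouches for the new one."""
        if old_hash == new_hash or not self._verify(old_hash, new_hash, proof):
            return False
        self._successor[old_hash] = new_hash
        return True

    def current(self, cert_hash: str) -> str:
        """Follow confirmed roll overs to the hash that is valid today."""
        seen = {cert_hash}
        while cert_hash in self._successor:
            cert_hash = self._successor[cert_hash]
            if cert_hash in seen:
                raise ValueError("roll over cycle detected")
            seen.add(cert_hash)
        return cert_hash
```

Both the origin check and the URL resolver would call `current()` before comparing hashes, so data stored under the old hash stays reachable after the switch.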
5 Supporting TLS Mutual Auth & Dynamic Identities
The previous sections explain how we can use certs directly to authenticate servers, but what about authenticating clients? In a peer to peer system authentication is symmetric, so using TLS mutual auth is a very natural solution. It’s certs all the way down. But making this work requires a few changes.
First, XMLHttpRequest needs to be augmented to allow for the submission of a client cert.
Second, I can see a use for something like the HTML5 form feature keygen to allow a new client cert to be created either programmatically or via the UX. The idea would be that the user would say “Use a new identity” with the website, so that the user can establish an authenticated relationship with the website without exposing their ’primary’ identity.
6 Servers are people too
It’s really about time that node.js or some equivalent be part of HTML5. We also need the ability to bind a server to whatever URL or URLs we want since the peer to peer web expects to use at least the three types of transports already mentioned (point to point, onion and mix) and will be experimenting with different forms of these transports. There also needs to be some kind of life cycle model to allow for essentially permanent servers to be run in the background. We also need a standard way (a la Web Messaging/Web Workers) to let web pages securely talk to the node.js server code.
7 A standard for portable applications
In a peer to peer world one is likely to run apps completely locally, since one’s peers are typically devices that really don’t want to serve the world. The apps being passed around are just that: apps, not web sites. So we need a way to just hand over the whole package. Two things are really needed: a standard for putting an application into a ZIP file with a defined file structure, and an extension to the AppCache manifest so the application can be loaded and run from inside that ZIP file. Something like the HTML5 proposed (but not widely supported) FileSystem API could be useful here.
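To give a feel for how small such a standard could be, here is a Python sketch that packs an app into a ZIP file and loads it back. The layout (a `manifest.appcache` file at the root that lists every file in the package) is purely my assumption, and only the simplest form of the manifest is handled.

```python
import io
import zipfile
from typing import Dict

MANIFEST_NAME = "manifest.appcache"  # assumed location inside the ZIP


def build_package(files: Dict[str, bytes]) -> bytes:
    """Pack an app's files plus a manifest listing them into one ZIP blob."""
    manifest = "CACHE MANIFEST\n" + "".join(name + "\n" for name in sorted(files))
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(MANIFEST_NAME, manifest)
        for name, data in files.items():
            package.writestr(name, data)
    return buffer.getvalue()


def load_package(blob: bytes) -> Dict[str, bytes]:
    """Read back exactly the files the manifest lists, straight from the ZIP."""
    with zipfile.ZipFile(io.BytesIO(blob)) as package:
        lines = package.read(MANIFEST_NAME).decode("utf-8").splitlines()
        if not lines or lines[0].strip() != "CACHE MANIFEST":
            raise ValueError("missing CACHE MANIFEST header")
        listed = [
            line.strip()
            for line in lines[1:]
            if line.strip() and not line.lstrip().startswith("#")
        ]
        # A listed file that is absent from the ZIP raises KeyError here.
        return {name: package.read(name) for name in listed}
```

A peer could pass the resulting blob along over any transport, and the receiver could check a signature over it before loading it.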