Serval is a project that wants to enable mobile phones to work no matter what. They have built mesh technology to let mobiles make voice calls as well as share data and have an app available on Android to use this technology. This is a technology that Thali could potentially really leverage. In the first section below I give a quick walk through of Serval. In the next section I compare and contrast Serval and Thali’s ways of solving similar problems. Then I conclude that hopefully we can reach a point where Thali just runs on top of Serval.
1 Defining Serval
My understanding is that Serval’s goal is to enable smart phones to work without formal infrastructure, either because such infrastructure is too expensive (e.g. local people can’t afford the official cell phone infrastructure or it costs too much to build) or because it simply isn’t available (e.g. after a disaster).
To that end Serval has built a GPL licensed open source project whose goal is to provide voice and data services over cell phones using Bluetooth and WiFi. It supports a mesh based protocol to create ad-hoc networks with “low latency” meshing (e.g. relaying packets in real time) as well as a higher level service called Rhizome to provide store and forward style meshing.
Below I pick a few of the protocols supported by Serval that I found most interesting just to give the reader a sense of what Serval can do.
1.1 Mesh Datagram Protocol (MDP)
MDP is a mesh protocol designed to create what we call “low latency” meshes. These are meshes where different devices expect to be able to communicate in near real time. The classic scenario is that Joe wants to talk to Jane but isn’t in radio range. So instead Joe sends his message to Jack who is within range of Joe and Jane and Jack relays it to Jane.
MDP is not an IP based protocol. Instead every node has an address called a Serval ID (SID) which is its 256 bit Elliptic Curve public key. MDP is a lot like UDP in that it doesn’t guarantee ordering and duplicates are possible. The signing and encryption of packets is built directly into the MDP packet structure. This is sort of the equivalent of DTLS (well, o.k., kinda).
The job of the MDP infrastructure is to figure out a path between a device that sent a packet and the device that is intended to consume that packet. This can involve re-transmissions (especially on lossy connections).
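To make the relaying idea concrete, here is a minimal sketch of path finding over a mesh. It is not Serval’s routing algorithm (I don’t know the details of that); it is a plain breadth-first search, and the `links` table and the `find_path` function are names I made up for illustration.

```python
from collections import deque


def find_path(links, source, destination):
    """Breadth-first search for the shortest relay path between two nodes.

    `links` maps each node to the set of nodes within its radio range.
    Returns the list of hops, or None if the destination is unreachable.
    """
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Joe and Jane are out of range of each other, but Jack can hear both.
links = {
    "joe": {"jack"},
    "jack": {"joe", "jane"},
    "jane": {"jack"},
}
print(find_path(links, "joe", "jane"))  # ['joe', 'jack', 'jane']
```

A real mesh also has to cope with links that come and go and with lossy radios, which is where the re-transmissions come in.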
1.2 Mesh Stream Protocol (MSP)
At the risk of having the Serval developers throw something at me I think one can fairly think of MSP as being Serval’s equivalent of TCP. That is, it creates a reliable ordered stream of packets that runs on MDP instead of IP.
1.3 Distributed Numbering Architecture (DNA)
Serval wants to make it as easy as possible for people to use it and one of the ways they do that is by trying to use a phone’s normal phone number as an addressing mechanism even when there is no cell infrastructure available. But the “real” addresses as defined in the MDP section are actually public keys. So a mapping mechanism is needed, and that’s DNA. DNA lets people claim numbers and uses MDP in broadcast mode to send out the moral equivalent of an ARP packet looking for anyone who claims the number. So if Joe wants to call Jack’s number then Joe’s phone would send out a DNA broadcast looking for Jack’s number and anyone who claims that number would respond with their public key.
Serval does have some mechanisms to validate someone’s ownership of a phone number but I honestly don’t know enough about how cell phones work to fully appreciate how those mechanisms would work in an offline context.
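The broadcast lookup can be sketched in a few lines. This is a toy model under my own assumptions: the `Phone` class, `on_dna_request` and `dna_lookup` are hypothetical names, the SIDs are placeholder strings rather than real public keys, and the sketch does no validation of ownership at all.

```python
class Phone:
    """A device that claims one phone number and owns one SID."""

    def __init__(self, sid, claimed_number):
        self.sid = sid  # placeholder for the device's public key
        self.claimed_number = claimed_number

    def on_dna_request(self, number):
        # Answer only if this device claims the requested number.
        return self.sid if number == self.claimed_number else None


def dna_lookup(phones_in_range, number):
    """Broadcast a request for `number` and collect the SIDs that answer."""
    replies = []
    for phone in phones_in_range:
        sid = phone.on_dna_request(number)
        if sid is not None:
            replies.append(sid)
    return replies


phones = [Phone("sid-joe", "555-0100"), Phone("sid-jack", "555-0101")]
print(dna_lookup(phones, "555-0101"))  # ['sid-jack']
```

Note that nothing here stops two devices from claiming the same number, in which case the caller gets more than one reply. That is exactly why validation matters.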
1.4 Voice over Mesh Protocol (VoMP)
VoMP is a voice oriented protocol natively designed to run over MDP and use DNA. So encryption, discovery, binding, etc. are all handled at lower layers. Its job is to give a good voice stream over the mesh network.
1.5 Rhizome
When dealing with content that, unlike voice, doesn’t need to be live streamed, Serval uses Rhizome. Rhizome is a distributed storage service that runs on top of the Serval mesh. Each device has its own Rhizome store, called the Rhizome database. Every time two Serval nodes communicate they do a Rhizome synchronization.
Rhizome’s data model is based on the concept of a bundle which is an indivisible chunk of content that can be of any size. It contains a manifest and a payload.
When a bundle is created the creator will generate an EC private key associated with that bundle and will then use the corresponding public key as the ID for that bundle. The creator will also generate a version number for the bundle and will then sign the bundle’s content and version number with the private key.
When two devices synchronize, each looks for bundle IDs the other has that it lacks and, for IDs they both have, checks which copy has the higher version. Higher versions always replace lower ones.
Rhizome is essentially a flood based architecture that creates a shared universal database of bundles.
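The synchronization rule is simple enough to sketch. The following is my own simplification, not Rhizome’s wire protocol: `Bundle` and `sync` are hypothetical names, stores are plain dictionaries keyed by bundle ID, and signature checking is left out entirely (a real node would verify the signature before accepting a bundle).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bundle:
    bundle_id: str  # stands in for the bundle's public key
    version: int
    payload: bytes


def sync(store_a, store_b):
    """Bring two stores to the same state; the higher version always wins."""
    for bundle_id in set(store_a) | set(store_b):
        copies = [s[bundle_id] for s in (store_a, store_b) if bundle_id in s]
        newest = max(copies, key=lambda bundle: bundle.version)
        store_a[bundle_id] = newest
        store_b[bundle_id] = newest


store_a = {
    "b1": Bundle("b1", 1, b"old"),
    "b2": Bundle("b2", 1, b"only on A"),
}
store_b = {"b1": Bundle("b1", 2, b"new")}
sync(store_a, store_b)
print(store_a == store_b)  # True
```

Run this between every pair of devices that meet and every bundle eventually floods to everyone, which is where the shared universal database comes from.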
1.6 Cooee Service Discovery Protocol
This is Serval’s answer to mDNS or SSDP. Cooee allows devices to broadcast requests to find other services. Each service is described with a set of name/value pairs and a service discovery request contains patterns to match against those name/value pairs. Any device which has a service that matches a request broadcast will respond with service location data.
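Here is a rough sketch of matching a request against advertised services. I don’t know Cooee’s actual pattern syntax, so I am assuming shell style wildcards via Python’s `fnmatch` module; the `matches` and `discover` functions and the sample services are invented for illustration.

```python
from fnmatch import fnmatchcase


def matches(service, request):
    """True if every pattern in the request matches the service's value."""
    return all(
        name in service and fnmatchcase(service[name], pattern)
        for name, pattern in request.items()
    )


def discover(devices, request):
    """Return (location, service) for every service matching the request.

    `devices` maps a device's location (here a placeholder SID) to the
    list of services it offers, each a dict of name/value pairs.
    """
    hits = []
    for location, services in devices.items():
        for service in services:
            if matches(service, request):
                hits.append((location, service))
    return hits


devices = {
    "sid-jack": [{"type": "printer", "colour": "yes"}],
    "sid-jane": [{"type": "chat", "room": "lobby"}],
}
print(discover(devices, {"type": "print*"}))
```

In the real protocol the request is broadcast and each device does its own matching; the loop over `devices` just stands in for that broadcast.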
1.7 Serval Internet Location Service (SILS)
SILS maps SIDs to IP addresses and Ports. These mappings allow Serval’s protocol to extend over the Internet.
1.8 Tunneling
Serval supports tunneling TCP over MSP. The MSP tunnel is directional: a tunnel has a client and a server, and the client can open connections to the server but not the other way around. Each TCP connection is turned into an MSP connection.
2 What should the relationship be between Serval and Thali?
Thali is about the peer to peer web. We want to enable people to run their own services on their own devices. We are starting with smart phones (specifically Android and iOS) but we fully intend to go everywhere from Raspberry PI style devices to tablets to laptops to desktops to the cloud. We have strongly opinionated views on what technologies we want to use to build our implementations of the peer to peer web. Our server is Node.js. Our transport security protocol is TLS. Our application protocol is HTTP. Our synchronization protocol is CouchDB. Our Internet discovery and connection solution is Tor Hidden Services.
But along the way we ended up having to go into a hybrid mode where we could run on the mainline Internet as well as in places where there isn’t any Internet. When there is Internet (e.g. WiFi Infrastructure Mode or Cell) we just use it. But when the Internet is gone we have written code to use BLE/Bluetooth on Android and Multi-Peer Connectivity Framework on iOS to allow us to move IP connections. We also had to create our own discovery infrastructure. We also expect to eventually add in support for WiFi Hotspots.
That all having been said, Thali and Serval share more in common than they have differences. Below I walk through comparisons of how Serval and Thali do things.
2.1 SIDs and Thali Public Keys
Serval and Thali both solved the same problem in the same way. We both needed a way to let users create and own their own identities without any centralized infrastructure and we both used EC public keys to do it. I’m too lazy to check if the curves are the same but even if they aren’t we can always map and bridge.
2.2 MDP/MSP vs IP
Serval is not IP based while Thali is exclusively IP based. But in practice it turns out this really and truly just plain doesn’t matter. On the Thali side the work we have done to let us use the multi-peer connectivity framework and Bluetooth means it would be really easy for us to consume MSP directly if we wanted to. On the Serval side they have already created a bridge between MSP and TCP. So we can skin this particular cat in at least two different ways.
2.3 MDP Encryption vs TLS
MDP has built into it the ability to both encrypt and sign packets. Thali on the other hand has standardized on TLS. In practice though this should just mean that we won’t encrypt/sign our MDP packets in favor of using TLS. The reason, btw, for standardizing on TLS is that we want to use the same security architecture regardless of the underlying transport. For example, we use TLS over Bluetooth and the multi-peer connectivity framework. This lets us make our system more secure by having exactly one security infrastructure that works exactly one way regardless of the underlying transport. But again, this shouldn’t be a problem with Serval.
2.4 Rhizome and CouchDB
Rhizome and CouchDB are designed to solve different problems and so I don’t see them as competitors but as complements.
Rhizome is designed to provide a shared public board space where people can post up whatever they want and it will be shared around.
CouchDB is about providing a database semantic with collaboration capabilities.
Now, to be completely fair, it’s all software and we certainly can beat Rhizome into doing what CouchDB can do but I don’t think that is really useful. So we should just use both.
What’s also interesting is that Thali already needs something like Rhizome. We have scenarios where user A really needs to get data to user B but user B isn’t around. So what we want to do is to serialize out the changes in CouchDB to a file, sign and encrypt it and then blast it out to anyone we can in the hope that eventually it will get to user B. Rhizome would be perfect for this.
2.5 Cooee vs Thali Discovery
Serval currently seems to have two kinds of discovery.
In “low latency” mesh mode user A wants to talk to user B immediately. So they can just try to send out an MDP request with the public key of user B and if it arrives, great, if not, not. This is the telephone model if you will: type in the number and you get an answer or you don’t.
The other model is Cooee where a device needs a service and tries to see who is around with that service.
Thali however has a third kind of discovery. The source of Thali’s discovery is based on Thali’s very high bar for privacy. We are not interested in creating a user tracking service. We absolutely don’t want someone able to discover a user’s location unless the user wants that location discovered.
On the Internet we handle this via Tor Hidden Services. Anyone who wants to communicate with a user can walk up to Tor and try to connect to their key but Tor lets the user decide if they want the connection and it protects their location. With Tor Hidden Services knowing someone’s key lets you talk to them but it doesn’t let you find their actual location.
When we use local radios we use a beacon based discovery system where each beacon identifies who the person is looking for. But the only people who can read a beacon are either the sender or the intended receiver, it’s gibberish to everyone else. This allows us to do discovery without exposing someone’s identity to everyone around.
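To give a feel for the property (only the sender or the intended receiver can make sense of a beacon), here is a deliberately simplified sketch. It is not Thali’s actual beacon format. I am assuming, purely for illustration, that each pair of users already shares a secret key, and I use an HMAC over a random nonce as the beacon; `make_beacon`, `read_beacon` and the contact names are all hypothetical.

```python
import hashlib
import hmac
import os

NONCE_SIZE = 16


def make_beacon(shared_key):
    """Build a beacon that only holders of `shared_key` can recognise."""
    nonce = os.urandom(NONCE_SIZE)
    tag = hmac.new(shared_key, nonce, hashlib.sha256).digest()
    return nonce + tag


def read_beacon(beacon, contacts):
    """Return the name of the contact who sent the beacon, or None.

    `contacts` maps a contact's name to the key shared with that contact.
    Anyone without the right key just sees random looking bytes.
    """
    nonce, tag = beacon[:NONCE_SIZE], beacon[NONCE_SIZE:]
    for name, key in contacts.items():
        expected = hmac.new(key, nonce, hashlib.sha256).digest()
        if hmac.compare_digest(tag, expected):
            return name
    return None


alice_bob_key = os.urandom(32)  # assumed to be shared out of band
beacon = make_beacon(alice_bob_key)

print(read_beacon(beacon, {"alice": alice_bob_key}))  # alice
print(read_beacon(beacon, {"mallory": os.urandom(32)}))  # None
```

Because the nonce changes every time, two beacons from the same sender look unrelated to a bystander, which is the point: no stable identifier leaks to the people nearby.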
Putting this in Serval terms, if Thali used Serval we would constantly be rotating the device’s SID so that the user can’t be tracked from the mesh routing data.
So to make Thali work with the same privacy bar on Serval we would need to introduce a higher level service that can let people translate from someone’s long term public key to their current SID in a trusted way. But I think we could build this using some combination of Cooee and Rhizome.
2.6 DNA vs Identity Exchange
The problem that all public key based systems have is: how do you exchange public keys? A public key isn’t like an email address or a phone number, which are short enough to be exchanged verbally. This goes to Zooko’s triangle which says that if you are going to have a system with secure and decentralized names then they can’t be human meaningful. Serval gets around this issue to some extent by binding people’s phone numbers to their public key.
In Thali though we don’t rely on any centralized system (it would rather defeat the point) like phone numbers. So we have had to introduce our own identity exchange system. But implementing this over Serval would be trivial. It mostly would involve whatever we do to bridge discovery. The rest could be handled with our existing HTTP based solution.
2.7 GPL vs MIT
Serval is licensed primarily under GPL while Thali is under MIT. So while Serval’s GPL community can use Thali (GPL can use MIT), the opposite is not necessarily the case.
The situation may not be as dire though as it appears. GPL doesn’t apply across process boundaries. So we could stand Serval up as a stand alone process and then use its TCP bridge. I’m not sure exactly how we would hook into discovery but my guess is that adding a TCP interface we can listen to wouldn’t be a big deal.
Still, it would be much easier if Serval were dual licensed under MIT.
3 Conclusion
Serval and Thali have complementary goals. Serval is about enabling mobile devices to communicate without central infrastructure. Thali wants to build the peer to peer web. One of the places we want to run the peer to peer web is over mobile devices both with and without Internet connectivity. This isn’t the only place but it is a very important one. So in an ideal world we would just use Serval.
My plan is to talk with Paul Gardner-Stephen (the head of Serval) and get into more details than the Serval docs gave me. I’m especially interested in learning about their plans for iOS, how they operate without WiFi extenders, how they handle discovery (I see them talk about Bluetooth but not BLE, which matters from a battery perspective), and how they think about hybrid scenarios where the same device running the same software will be in environments with Internet connectivity and in environments without it. The devil is always in the details.
But Serval wants to be in the radio business, Thali doesn’t. So we have a motivation to use Serval if it can meet our needs.
I have been following Serval and Thali too.
What was the outcome of the discussions?
Is there a plan to find out how to utilize the best of each?
Right now Thali has to figure out how to get off JXcore and how to get radios in Android working properly. Until we figure that out moving forward on Serval isn’t going to happen.