Bearer Tokens, Discovery and OAuth 2.0
Part of my day job is working on adding discovery to OAuth 2.0, and this article summarizes some of that work. So I was more than a little concerned when I saw a blog article from Eran Hammer-Lahav, the editor of OAuth 2.0, asserting that OAuth 2.0 couldn't support secure discovery. Worried that something was terribly wrong, I read Eran's article carefully. Below I summarize what I believe his concerns are and explain how I believe extensions to OAuth 2.0 that support discovery would address them. I also explain how Eran's article helped me find a flaw in my own proposal and how I propose fixing that flaw.
1 The replay attack - advertising the wrong token endpoint but the right protected resource
The first part of Eran's article is a general critique of bearer tokens. The issues raised are all well known, and it is equally well understood that they do not apply to the scenarios that core OAuth 2.0 addresses. In fact, halfway through his article, in the section "Why None of this Matters Today", Eran agrees that his critiques don't really apply to the core OAuth 2.0 use cases. However, it is right after that section that Eran gets to what seems to be really bothering him: his contention that the use of bearer tokens with discovery will make clients susceptible to replay attacks.
He never gave a detailed example, but I believe something like the following captures his concern. Imagine that a protected resource, which we'll call https://evil.example.com, advertises that its token endpoint is Google. The client is therefore expected to go to Google, get an access token and then give it to https://evil.example.com. Evil.example.com will then turn around and replay the token to other services, thus successfully impersonating the client. What makes the attack possible is that in a discovery-based scenario a client has to discover both the protected resource endpoint and the token endpoint. This provides an opportunity for a bad service to lie about its token endpoint. In this case the lie would give the attacker a Google access token.
In practice, however, I don't believe this attack will work. The reason is that in discovery-based systems (such as WS-Federation, SAML-P, etc.) one of the mandatory arguments is 'audience'. Audience defines the protected resource a requested token is going to be used with. Both the request for the access token and the access token itself will contain the audience value the token is targeted at. The use of audience provides two levels of protection.
First, when the fooled client goes to Google's token endpoint to ask for an access token, it will have to specify that it intends to use the access token with the protected resource https://evil.example.com. The logic here is trivial: the client takes the location it will send the access token to and slaps that into its request to the token endpoint. Right away Google will see that https://evil.example.com is not one of its supported endpoints and so will reject the access token request.
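The first level of protection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `audience` parameter name, the `unsupported_audience` error code, the allow-list and the `https://calendar.google.example` URI are all hypothetical and are not taken from any OAuth 2.0 draft or from Google's actual endpoints.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of protected resources this token endpoint issues tokens for.
SUPPORTED_AUDIENCES = {"https://calendar.google.example"}


def normalize(uri: str) -> str:
    """Lower-case scheme and host and drop a trailing slash so comparisons are stable."""
    parts = urlsplit(uri)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}"


def build_token_request(resource_uri: str) -> dict:
    """Client side: copy the URI the token will be sent to into the request.

    The 'audience' parameter name is an assumption made for this sketch.
    """
    return {"grant_type": "client_credentials", "audience": resource_uri}


def handle_token_request(request: dict) -> dict:
    """Token endpoint side: refuse to issue tokens for audiences it does not serve."""
    audience = request.get("audience")
    if audience is None or normalize(audience) not in SUPPORTED_AUDIENCES:
        # Hypothetical error code, used here only for illustration.
        return {"error": "unsupported_audience"}
    return {"access_token": "opaque-demo-token", "audience": normalize(audience)}
```

A client fooled by evil.example.com's discovery document would call `handle_token_request(build_token_request("https://evil.example.com"))` and get the error back, since that URI is not on the allow-list.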
Even if the attack somehow worked and Google issued the access token, the token would still include the audience it was intended for, https://evil.example.com. So if evil.example.com tries to replay the token somewhere else the replay will fail because the protected resource it gives the token to will see that the audience value isn't addressed to it.
There is no magic here btw. In essence the inclusion of audience in a signed token plays the same role that signatures played in OAuth 1.0a. It associates the target of a request with the request itself and so prevents replay attacks.
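To make the second level of protection concrete, here is a minimal sketch of a signed token that carries its audience and of the check a protected resource would run before accepting it. It assumes an HMAC key shared between the issuer and the resource purely to keep the example within the standard library; a real deployment would more likely use asymmetric signatures. The token layout, the claim names `sub` and `aud`, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Demo key only; assumed to be shared by the issuer and the protected resource.
ISSUER_KEY = b"demo-key-for-illustration-only"


def issue_token(subject: str, audience: str, key: bytes = ISSUER_KEY) -> str:
    """Issuer side: sign a payload that names the audience the token is for."""
    payload = json.dumps({"sub": subject, "aud": audience}, sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(signature).decode())


def accept_token(token: str, my_uri: str, key: bytes = ISSUER_KEY) -> bool:
    """Resource side: accept only tokens that are validly signed AND addressed to us."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False  # payload was altered or signed with a different key
    return json.loads(payload).get("aud") == my_uri
```

A token issued for https://evil.example.com passes `accept_token` only when `my_uri` is https://evil.example.com. Replayed to any other resource it fails the audience comparison, and rewriting the `aud` value breaks the signature.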
Thus, to make OAuth 2.0 useful for discovery-based contexts, we have to define how to submit the protected resource's URI in an access token request, as well as require that the produced access token include that URI as an audience claim. This is exactly the sort of proposal that I and others are working on in order to enable OAuth 2.0 to support discovery.
2 Reversing the attack - advertising the right token endpoint and the wrong protected resource
While I was reviewing Eran's proposed attack I wondered what would happen if we reversed the attack. In the reverse scenario an evildoer is targeting users of calendar.live.com and launches a phishing attack on them. Thanks to the phishing attack, a user of calendar.live.com is fooled into asking their local calendar client to do discovery on https://calendar.evil.example.com. The returned discovery document says that its calendar endpoint (i.e. the protected resource) is https://calendar.live.com and its token endpoint is https://sts.evil.example.com. This is the reverse of the previous attack. In this case the token endpoint being advertised is correct (i.e. it's the one owned by evil.example.com); it's the protected resource location that is false.
The user uses their Google identity to log into calendar.live.com (a man can dream, can't he?) and so the client will present the user's Google credentials (or a token representing the same, it doesn't matter) to Google's token endpoint to get what's called an 'on-behalf-of' token targeted at https://calendar.live.com.
This part of the attack will work just fine because in a discovery-based world Google is used to issuing on-behalf-of tokens for audiences it knows nothing about. So Google will happily produce the on-behalf-of token with an audience of https://calendar.live.com, which the client will then send to the advertised token endpoint, https://sts.evil.example.com. And now the trap is sprung: evil.example.com has a 100% genuine, Google-signed and Google-issued 'on-behalf-of' token that it can replay to calendar.live.com's real token endpoint, and so get an access token for https://calendar.live.com and do whatever it wants to the user's account.
The way to prevent the attack is to require that the request to Google and the issued on-behalf-of token contain both the protected resource address and the token endpoint address. Now if evil.example.com tries to replay the on-behalf-of token to Live's token endpoint, the request will be rejected: the token's audience value is good (i.e. it names a protected resource that the token endpoint serves), but its token endpoint value is https://sts.evil.example.com, which doesn't match Live's token endpoint.
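The difference between checking the audience alone and checking both values can be sketched as follows. Signature verification is assumed to have already succeeded and is left out to keep the example short. The `token_endpoint` claim name, the class and function names, and the URI `https://sts.live.example` standing in for Live's real token endpoint are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnBehalfOfToken:
    subject: str
    audience: str        # protected resource the token is ultimately for
    token_endpoint: str  # hypothetical claim: the token endpoint the client will present it to


# Hypothetical stand-in for Live's real token endpoint and the resources it serves.
LIVE_TOKEN_ENDPOINT = "https://sts.live.example"
LIVE_RESOURCES = {"https://calendar.live.com"}


def issue_on_behalf_of(subject: str, resource_uri: str, token_endpoint_uri: str) -> OnBehalfOfToken:
    """Issuer side: record both values the client learned from discovery."""
    return OnBehalfOfToken(subject, resource_uri, token_endpoint_uri)


def audience_only_check(token: OnBehalfOfToken) -> bool:
    """The weaker check: looks only at the audience, so a replayed token gets through."""
    return token.audience in LIVE_RESOURCES


def audience_and_endpoint_check(token: OnBehalfOfToken) -> bool:
    """The proposed check: the token must also name this token endpoint."""
    return token.audience in LIVE_RESOURCES and token.token_endpoint == LIVE_TOKEN_ENDPOINT


# The phished client asked for a token using the values from the evil discovery document.
stolen = issue_on_behalf_of("alice", "https://calendar.live.com", "https://sts.evil.example.com")
```

With these definitions `audience_only_check(stolen)` returns `True`, which is the attack succeeding, while `audience_and_endpoint_check(stolen)` returns `False` because the token names evil.example.com's token endpoint rather than Live's.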
Many thanks to Eran for helping to unearth this attack.