Part 9: OAuth 2.0: Authorization, Not Authentication
This is part nine of a tutorial blog series from Ben Finkel addressing the challenges, solutions, and implementation of sound authentication. By the end of this series, you will be confident in your ability to implement an authentication system, even with little-to-no background.
The phrase "OAuth 2.0 is an authorization framework, not an authentication framework" comes up a lot when you're researching OAuth 2.0. It can be tiresome, because it's often offered up without any context. It's a nice statement, but what exactly does it mean? If you've been following along with this blog series you should be able to understand at this point why it's an issue.
Recall that in our previous post we added GitHub as a provider, and that GitHub's implementation was dramatically different from Google's. Every provider has a slightly different implementation, and if their documentation is not up to date it can be difficult to figure out how to get identity information from a provider. In fact, there is nothing in the OAuth 2.0 specification that dictates any identity information has to be supplied at all. It's completely up to the provider's discretion.
On top of that, OAuth 2.0 leaves a fair amount of flexibility in how data is passed back and forth, and providers make use of it. Values can be supplied via HTTP headers, in a POST body, or in a query string. Responses may come back as JSON, XML, or form-encoded text, depending on the provider.
All in all, it makes for a headache whenever we want to implement OAuth 2.0 for any purpose, but especially for authentication.
OpenID Connect to the rescue
The OpenID Connect specification attempts to alleviate all of these problems. By standardizing the areas described above, including a definition of how identity information is passed along, it lets consumers write a fairly simple piece of code that will work to authenticate against *any* OpenID Connect provider. The only variables in the equation are endpoints, client IDs, and scope.
For our library, we're going to copy the googleOAuth.py file and treat OpenID Connect as simply another provider. This will allow it to integrate seamlessly with our structure. It will also be a very minimal amount of work because Google was already implementing the OpenID Connect specification most of the way (that's one reason we chose to start with them). In fact, if you were exploring the HTTP requests and responses closely, you may have noticed a funny piece of data included on the access token response from Google called id_token. Guess what: that's our OpenID Connect data. All we need to do is decode that token.
Let's look at openid.py and see just how we establish an OpenID provider.
First, we'll need to import two additional libraries: jwt and base64. These will be used later to decode that token I just described.
After that, we get rid of our hard-coded client and endpoint variables. Those will all change per provider. Instead, we've created a "provider_matrix" that contains all necessary data for each provider. We'll start with two OpenID Connect providers, Google and Yahoo.* Notice the scope has to be slightly different between them. The OpenID Connect spec defines openid as one required value in the scope, but getting an email address requires a provider-specific value as well (email in the case of Google and sdpp-w for Yahoo). It's a shame we'll still need some provider-specific documentation to make this work, but it's dramatically less than we required before.
Also, note the two parameter dictionaries have been totally blanked out so they can be filled in dynamically.
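As a rough illustration of the structure described above, here is a minimal sketch of what such a matrix might look like. The key names, the placeholder credentials, the query_token name, and the endpoint URLs are all assumptions made for this example; check each provider's documentation and the repository's openid.py for the real values.

```python
# Hypothetical layout for the provider matrix. The key names and endpoint
# URLs are assumptions for illustration; confirm them against each
# provider's OpenID Connect documentation.
provider_matrix = {
    "google": {
        "client_id": "YOUR_GOOGLE_CLIENT_ID",
        "client_secret": "YOUR_GOOGLE_CLIENT_SECRET",
        "endpoint_auth": "https://accounts.google.com/o/oauth2/v2/auth",
        "endpoint_token": "https://oauth2.googleapis.com/token",
        # "openid" is required by the spec; "email" asks Google for the address.
        "scope": "openid email",
    },
    "yahoo": {
        "client_id": "YOUR_YAHOO_CLIENT_ID",
        "client_secret": "YOUR_YAHOO_CLIENT_SECRET",
        "endpoint_auth": "https://api.login.yahoo.com/oauth2/request_auth",
        "endpoint_token": "https://api.login.yahoo.com/oauth2/get_token",
        # Yahoo uses its own scope value for profile data.
        "scope": "openid sdpp-w",
    },
}

# The two parameter dictionaries start out empty and are filled per request.
query_auth = {}
query_token = {}
```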
The SignIn function changes very little. We just need to populate the endpoint_auth and query_auth values with the appropriate provider details. One important note: we had to add provider as a variable being passed into this function. For compatibility's sake I also added that as a parameter in the Google and GitHub OAuth implementations (even though they do nothing with it).
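To make the idea concrete, the following is a small sketch of how a sign-in URL could be assembled from per-provider details. The function name build_sign_in_url, the PROVIDERS dictionary, and the provider.example endpoint are hypothetical stand-ins, not the code from the repository.

```python
from urllib.parse import urlencode

# Hypothetical single-entry matrix so the sketch runs on its own.
PROVIDERS = {
    "example": {
        "client_id": "YOUR_CLIENT_ID",
        "endpoint_auth": "https://provider.example/authorize",
        "scope": "openid email",
    },
}

def build_sign_in_url(provider, redirect_uri, state):
    """Return the URL the user is redirected to in order to sign in."""
    details = PROVIDERS[provider]
    # Fill the query parameters from the chosen provider's details.
    query_auth = {
        "response_type": "code",
        "client_id": details["client_id"],
        "redirect_uri": redirect_uri,
        "scope": details["scope"],
        "state": state,
    }
    return details["endpoint_auth"] + "?" + urlencode(query_auth)
```

Because the provider name is the only thing that selects the endpoint and scope, adding another OpenID Connect provider in this sketch is a matter of adding one more dictionary entry.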
Lastly, we update our GetAccessToken function. At the top half we populate with the appropriate provider details just like the SignIn function (and change the parameter set for the function in all of our implementations accordingly).
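The first half of that function might look roughly like the sketch below, which prepares the token request without sending it. The function name build_token_request and the shape of the details dictionary are assumptions for illustration; the parameter names are the standard ones for the OAuth 2.0 authorization code grant.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_token_request(details, code, redirect_uri):
    """Prepare (but do not send) the POST that trades a code for tokens."""
    query_token = {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": details["client_id"],
        "client_secret": details["client_secret"],
        "redirect_uri": redirect_uri,
    }
    body = urlencode(query_token).encode("ascii")
    # Supplying a body makes urllib issue a POST rather than a GET.
    return Request(
        details["endpoint_token"],
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the returned object to urllib.request.urlopen would perform the exchange; for an OpenID Connect provider the JSON response should include the id_token alongside the access token.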
The second half of the function is where we extract the email details out of the JWT that was supplied to us in the response. You can learn more about JWT here: https://jwt.io
It's a simple object, however: three individually Base64url-encoded segments separated by periods. The first segment is the header and the second is the payload, both of which are JSON objects. The third is a signature (which can and should be used for verification). We split the token and take the second segment, Base64-decode it, load it from JSON into a Python object, and extract the email from a parameter called email.
This object is very narrowly defined in the OpenID Connect specification, so we can confidently work with it just like this for any and all OpenID Connect providers.
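The decoding steps described above can be sketched in a few lines. The function name decode_id_token_payload is hypothetical, and this sketch deliberately skips signature verification, so treat it as a way to see what is inside the token rather than something to trust in production.

```python
import base64
import json

def decode_id_token_payload(id_token):
    """Return the payload of a JWT as a dict, WITHOUT verifying it."""
    # A JWT is header.payload.signature; we only read the middle segment.
    header_b64, payload_b64, signature_b64 = id_token.split(".")
    # JWT segments are URL-safe Base64 with the padding stripped,
    # so restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload_json = base64.urlsafe_b64decode(payload_b64)
    return json.loads(payload_json)
```

Calling decode_id_token_payload(token)["email"] on an id_token from a provider that was asked for the email scope should give the user's address. A real deployment should also check the signature and claims such as the issuer, audience, and expiry, for example with a JWT library.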
That's all we need to do for our implementation. 100 lines of code and we're ready to go. Let's just update our other files to include this provider.
OAuth.py is updated just as it would be for any new provider. We import openid.py at the top and add the appropriate functions to our two dictionaries. We also have to update our calls to signInFunc and accessFunc to include the provider name (since openid.py needs to use that to choose endpoints). This is why we updated the provider-specific code to accept a provider variable as well.
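Here is a simplified sketch of that dispatch pattern. The stand-in functions, the dictionary name sign_in_funcs, and the provider keys are hypothetical; in the real project the functions come from the imported provider modules.

```python
# Stand-ins for functions that would be imported from provider modules.
def github_sign_in(provider):
    # Accepts provider only for a uniform signature; it is ignored.
    return "github sign-in"

def openid_sign_in(provider):
    # The OpenID Connect module uses the name to pick endpoints and scope.
    return "openid sign-in for " + provider

# One dictionary maps each provider name to its sign-in function.
sign_in_funcs = {
    "github": github_sign_in,
    "google": openid_sign_in,
    "yahoo": openid_sign_in,
}

def sign_in(provider):
    """Look up the handler and always pass the provider name along."""
    sign_in_func = sign_in_funcs[provider]
    return sign_in_func(provider)
```

Because every handler takes the same argument, the calling code never needs to know which providers actually use it. A second dictionary for the access token functions would follow the same shape.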
Finally, we update our HTML to include login links for our new OpenID Connect-friendly providers.
Deploy this code to your App Engine application and authenticate to your heart's content!
You can read the full tutorial series in these weekly installments:
Part 2: Delegating Authentication
Part 9: OAuth 2.0: Authorization, Not Authentication
Public Git for entire codebase: https://github.com/benfinkelcbt/OAuthLibrary
*Yahoo OpenID Connect Docs and Developer Network: https://developer.yahoo.com/oauth2/guide/openid_connect/