
OAuth Login should be standardized and this is why it cannot be


· 8 min read
Warren Parad

If you've landed on this page, it's probably because you too have asked why auth and login have to be so complicated for everyone. Fundamentally, the problem comes down to trust.

Background

I created a Stack Overflow account using my GitHub identity via "Log in with GitHub".

Stack Overflow can remove the GitHub login option at any time.

If they do that, I will no longer have access to my Stack Overflow account.

The problem comes down to: "How does a client application such as Stack Overflow know who the logged-in user is?"

Some definitions:

  • Client Application - Here we'll denote Stack Overflow as a Client Application
  • Identity Provider - GitHub is the OAuth Identity Provider
  • First party identity provider - The identity providers your application enables for your users to log in
  • Third party identity provider - These are your SSO providers your customers enable in your application so that their users can log into your application

Where does the trust come in?

The short answer is that Stack Overflow is in control of this because they decide whom to trust. Stack Overflow decides which Identity Providers to trust, and how to identify users. Thus they decide which login mechanisms can be used.

If they delegate this to another provider, then that provider is on the hook for whom to trust. For instance Stack Overflow can delegate all of login to Google Login or directly to Authress.

Trust matters because without it a malicious identity provider could issue fake tokens that allow one user to impersonate another. The identity providers that Stack Overflow picks could decide to self-issue tokens that represent their users and steal your data from Stack Overflow by pretending to be you. You might not be concerned if GitHub decided one day to delete some of your Stack Overflow posts or even remove your account. But what if you used GitHub to log into your bank?

You might say "I will never use GitHub to log into my bank", and that answer works for GitHub, but since Authress is an Identity Provider and our customers are banks, there is a concern of trust here. However, unlike GitHub, when you implement Authress it comes with terms of service and a contract.

This is also an issue with supporting Third party identity providers. One common problem is scaling up to enable SSO for your B2B customers. This is where you let them bring their own Identity Provider. By nature this Identity Provider is untrusted. You can't make the same trust claims about it as you might for GitHub, Google, or Authress. However, you still need to trust it to verify identities that come from that provider. Some identity providers let you specify Third party identity providers, but unless they explicitly state that it's a solution for managing SSO in B2B contexts, take a step back before continuing. Just because you can, doesn't mean you should. And when it comes to adding identity providers, in most cases you shouldn't. For in-depth details of which providers to select under different circumstances, check out the current Auth situation report.

In essence, this means only some identity providers can be trusted. As an application owner, you can't decide to arbitrarily trust any identity provider. Each identity provider you trust creates a new attack vector for your users' data. These are your First party identity providers. That subset of trusted identity providers is a problem that Stack Overflow cares about, and therefore has solved by selecting the Identity Providers they see as trustworthy. If they remove one from the list, it is because it was deemed untrustworthy.

There is no way around this.
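To make the trust decision concrete, here is a minimal sketch of what "deciding whom to trust" can look like inside a client application. It assumes the token's signature has already been verified elsewhere, and the issuer URLs, the `VerifiedClaims` type, and `resolve_user_key` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical issuer URLs; a real application would list the identity
# providers it has actually vetted.
TRUSTED_ISSUERS = frozenset({
    "https://idp-one.example",
    "https://idp-two.example",
})


@dataclass(frozen=True)
class VerifiedClaims:
    """Claims from a token whose signature was already checked (assumed)."""
    issuer: str
    subject: str


def resolve_user_key(claims: VerifiedClaims) -> str:
    """Return an application-level user key, or refuse untrusted issuers."""
    if claims.issuer not in TRUSTED_ISSUERS:
        raise PermissionError(f"untrusted identity provider: {claims.issuer}")
    # The key includes the issuer: the same subject value from two different
    # providers must never resolve to the same account.
    return f"{claims.issuer}|{claims.subject}"
```

Removing an entry from `TRUSTED_ISSUERS` is all it takes to lock out every account keyed to that issuer, which is exactly the Stack Overflow and GitHub scenario from the Background section.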

So, how do we standardize?

Because Stack Overflow is on the hook for deciding whom they trust, it follows that every application has this same problem. So, standardization requires standardization of trust. Is that even possible?

NO!

Fundamentally, it isn't possible, and we'll see why below. To see whether the problem of standardized trust can be circumvented, let's review the three existing solutions:

1. Identity Aggregators

Authress and other identity aggregators allow your application to select whom to trust. Once you trust an identity provider, your application will receive an opaque, but trusted, identity token whenever the user logs in. This token is secured by Authress. We've spent a lot of work deciding how to trust every OAuth provider we support out of the box, and how to support all the rest that follow the OAuth 2.1 specification.

One alternative solution is first-class browser support for identity management. Instead of your users logging into your application directly, they would log into their browser. Then the browser would allow your application to fetch these credentials from the browser, assuming the user has approved this interaction. This is how Chrome supports Google identities. No other providers exist for this directly, because really this is just an implementation detail. It doesn't matter how your application gets the user's identity: there is a user identity, and you trust some non-user party to share that identity with your application. In OAuth, these Identity Providers are actually known as the Authorization Server (AS).

Your application trusts the Identity Provider, which generates the user identity. However, these identity providers don't just give out tokens to whoever asks for one. It would be nice if they did, because then your application could figure out the user dynamically without the user having to take extra steps. The user goes to your application, and automatically you know who they are. This is exactly what you do for existing users using the session tokens generated by Authress, but this doesn't work for users that never logged in. And, in any case, there are privacy concerns here, so this is actually more problematic. In reality, this is a problem with ALL OAuth providers, and that is because they require client applications to register. Google Login refuses to generate tokens for your application without you going to Google, configuring your OAuth credentials, and then setting up your Identity Aggregator of choice, all before the user logs in the first time. They have to approve your application. (The reason for this is a bit complicated, but it boils down to the fact that some malicious actors ruined it for everyone by pretending to be Google Drive.)

It's worth mentioning that dynamic registration is a partial solution. Dynamic Client Registration is a strategy that allows your application to register itself with the provider at runtime and then ask for the user's token, without manual preregistration. That means you would not need to follow a complicated guide. This is really cool and could help reduce the burden here, but Google does not support dynamic registration. The fact is, many of the OAuth providers that clients and users actually use don't provide dynamic registration!
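For a sense of what dynamic registration involves when a provider does support it, here is a rough sketch of the registration request a client would send, using metadata field names from the OAuth 2.0 Dynamic Client Registration specification (RFC 7591). The endpoint URL and the `build_registration_request` helper are hypothetical, and the request is only built here, never sent.

```python
import json
import urllib.request


def build_registration_request(registration_endpoint, client_name, redirect_uris):
    """Build (but do not send) an RFC 7591 style client registration request."""
    metadata = {
        "client_name": client_name,
        "redirect_uris": redirect_uris,
        "grant_types": ["authorization_code"],
        "response_types": ["code"],
        # "none" marks a public client with no client secret (an assumption
        # for this sketch; confidential clients would choose differently).
        "token_endpoint_auth_method": "none",
    }
    return urllib.request.Request(
        registration_endpoint,
        data=json.dumps(metadata).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint; a real one would come from the provider's metadata.
request = build_registration_request(
    "https://idp.example/register",
    "My Application",
    ["https://app.example/callback"],
)
```

Even with this in place, the provider still decides whether to honor the registration, so the trust question only moves, it doesn't disappear.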

2. Web3 (Crypto)

So there's no trust-based solution for OAuth without dynamic registration as long as we have an Authorization Server or need an Identity Provider. The second solution attempts to do away with the identity providers altogether. For this we need to allow users to become their own Identity Provider. And there is actually an already available standard for this, known as Decentralized Identifiers (DIDs).

However, this doesn't necessarily solve the problem either. And that's because your application still needs to decide which implementation to trust. Accepting DID tokens from anywhere is fine, as long as you implement the right logic, and the user's provider actually follows a DID implementation standard. The problem is that many Web3 implementations don't actually support DID, but you could of course limit yourself to ones that do. The standard exists, but we can't force Stack Overflow to use it, and neither can anyone force your app to trust DID-based tokens. That still means at some point you could decide to stop supporting some subset of the tokens generated. Even Web3 clients don't trust every provider. Often they only enable this mechanism and limit it to one trusted crypto, like BTC or ETH, or whichever one they are trying to sell to you. Most Web2.0 applications don't offer logging in with Web3, because they want your email address. Logging in with a DID solution means you don't know who the user is other than the fact that they hold some private key. And most applications still want to email their users as if that would solve the global climate crisis.
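As a small illustration of why DIDs don't remove the trust decision, the sketch below parses the method out of a DID (the `did:<method>:<method-specific-id>` shape from the W3C DID syntax) and checks it against an allowlist. The allowlist contents and the function names are assumptions made for this example, and resolving the DID or verifying a signature is out of scope.

```python
# Hypothetical allowlist: the DID methods this application chooses to accept.
TRUSTED_DID_METHODS = frozenset({"key", "web"})


def did_method(did: str) -> str:
    """Extract the method name from a DID such as 'did:web:example.com'."""
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    return parts[1]


def is_trusted_did(did: str) -> bool:
    """True only for well-formed DIDs whose method the application trusts."""
    try:
        return did_method(did) in TRUSTED_DID_METHODS
    except ValueError:
        return False
```

The application still owns `TRUSTED_DID_METHODS`, so a user whose identifier uses an unlisted method is in the same position as a user whose GitHub login was removed.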

Since you are asking for an email, and are unwilling to use an alternative, that arguably means the email address is the solution to this problem, but let's quickly look at solution #3.

3. WebAuthn

The last exciting existing solution is WebAuthn. Like the Web3 approach, WebAuthn provides a consistent standard for login. However, again Stack Overflow and your application have to decide to trust the implementation.

WebAuthn, and really FIDO2, has tried to make it as easy as possible for client applications to accept the one standard. However, there are still problems, such as the improper usage of the ResidentKey implementation for hardware tokens. Even without those problems, most application portals today do not support WebAuthn, even though some identity aggregators such as Authress and identity providers such as GitHub and Google do. Again, it's up to your application to decide if you want to support it.

Solution: Email?

Arguably, email is the solution and has always been the solution. The problem is E-Mail is E-Mail. The standard for email itself is old and bad, and the providers for E-Mail are also old and bad.

With an email address you can:

  • Log into sites that allow username/password
  • Log into sites that allow WebAuthn
  • Log into sites that support federated login if the federated provider lets you bring an email
  • Web3 DID doesn't support email 🙁 - but that's kind of the point

There's still a problem associating the email across login mechanisms. Logging in with GitHub via my email is not the same as logging in via Google. Those are different identities unless your application trusts both of them to have verified your email address. In other words, as an application, do you trust GitHub when it gives you a user's email? How do you know that the user's email is what GitHub thinks it is, or even that GitHub verified that email address before giving it to you? Thankfully, we know GitHub did, but others definitely didn't. That means you can't just send emails to any email address any identity provider gives you, unless you trust that they did their job.
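The rule in the paragraph above can be sketched as a small check: only treat an email as usable for linking accounts when it comes from a provider you trust to verify emails and the provider says it is verified. The provider names in the set, the `email_for_linking` function, and the lowercase normalization are assumptions for this example; the `email_verified` flag mirrors the claim of the same name in OpenID Connect.

```python
from typing import Optional

# Assumption: the providers this application trusts to have verified emails.
PROVIDERS_TRUSTED_FOR_EMAIL = frozenset({"github", "google"})


def email_for_linking(
    provider: str, email: Optional[str], email_verified: bool
) -> Optional[str]:
    """Return a normalized email usable for account linking, else None."""
    if provider not in PROVIDERS_TRUSTED_FOR_EMAIL:
        return None  # we don't trust this provider's word about emails
    if not email or not email_verified:
        return None  # missing, or the provider itself says it is unverified
    # Simplification: lowercase the whole address so logins compare equal.
    return email.strip().lower()
```

Two logins end up on the same account only when both calls return the same non-`None` value, which again depends entirely on who is listed in `PROVIDERS_TRUSTED_FOR_EMAIL`.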

And in any case, that means we are back to the trust problem.

One way around this would be for browsers to implement identities that are tied to email addresses, and then provide those to the client application so the user can pick which one to use.

Credit xkcd

Conclusion

As we've seen up to this point, as a user, we really have to trust the application with our data. Not just with the data we give them, but that they delegate to the right identity provider and in the right way. Because fundamentally, we can't take our identity with us. With some of the provider implementations listed above we can certainly try, and some users do that with extensions, a special browser, a special OS, or a hardware device to allow them to move from one browser to the next one.

Obviously, one way around this would be legal mandates on client applications. We could require applications to implement some standards, but on the web, voting happens by majority, and that's if we are lucky. Thankfully, to counteract this, GDPR and similar regulations require that you as a user have the ability to take your data with you. You can take your data with you at any point, so it doesn't matter if you can't log in; you can still have those precious binary ones and zeros.

However, none of these are real solutions. Fundamentally, the application decides whom to trust, and we can't make an application provider trust other things just because as a user we want them to. Likewise, Stack Overflow could delete our data, and there is very little we could do about it, irrespective of what many "data protection / privacy advocates" believe. Stack Overflow owns it, maybe not legally, but factually. And they could do so at any moment, just like InfluxDB did.

✶ ✶ ✶

Figuring out which identity providers to trust is hard, but we've already done the work with Authress Authentication, which optimizes for SSO security for your customers.