Building native product integrations with popular apps such as Salesforce, Slack, or Jira into your application always seems simple at the beginning. All you’d need to do is build an authentication mechanism, make a few API calls, and deploy it to your customers, right?
Unfortunately, many engineering teams spend weeks stuck just at step 1 - authentication.
While building an authentication infrastructure that provides fully managed auth for over 100 SaaS integration connectors, as well as any integration built through our Custom Integration Builder, we uncovered and overcame challenges that even our team could never have anticipated when we set out to build Paragon years ago.
I thought it would be helpful to share our learnings in case you still want to build your product’s native integrations in-house, even though Paragon solves this problem out-of-the-box and at scale for your team.
Auth <> Security
We can’t talk about authentication without discussing security.
When handling auth for native integrations, whether it be through API Keys, OAuth, or even usernames and passwords (in rare cases), you’ll have to store multiple credentials for each of your users. Depending on your integration use case, you may be storing credentials/tokens that give you access to sensitive data in their accounting systems, to private Slack/Teams messages, to employee data in their HR platform - you get the idea.
And with great power comes great responsibility - your auth services need to be bulletproof from a security standpoint. It’s crucial to treat tokens like passwords, because they can be functionally equivalent in most cases.
Sure, it’s easy enough to store these credential values as plaintext in some database table that you can grab whenever you need to make a request on behalf of your users. But by doing so, you’re leaving yourself extremely vulnerable to data breaches, which can leak access to data across dozens of your customers’ other apps - a mistake that you may not ever recover from.
As an example, back in April 2022, Heroku was compromised and their users’ GitHub integration OAuth access tokens were stolen, causing several private GitHub repositories to be breached and cloned. Even npm had their data harvested, an event that was immediately covered by dozens of publications, including Forbes.
Paragon’s Approach to Securely Storing Credentials
At Paragon, we wanted to ensure that even in the worst-case scenario in which a database is compromised, the attacker cannot obtain decrypted credentials.
1. Encryption & Storage
We ensure that your customers’ integration credentials are symmetrically encrypted before they are stored.
Encryption keys are stored independently in a separate database. Whenever we need the decrypted credentials to make an API call, our Workflow service fetches the encrypted value and the associated encryption key, then decrypts the credential locally in memory.
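To make that pattern concrete, here is a minimal sketch of envelope-style credential encryption using Node’s built-in crypto module. The fetchKeyFromKeyStore helper is a hypothetical stand-in for the separate key database - this illustrates the approach, not Paragon’s actual code.

```typescript
import crypto from "node:crypto";

interface EncryptedCredential {
  keyId: string;      // which key in the separate key store encrypted this value
  iv: string;         // per-credential random IV (hex)
  authTag: string;    // AES-GCM integrity tag (hex)
  ciphertext: string; // the encrypted token (hex)
}

// Hypothetical lookup against the separate key database; stubbed for illustration.
async function fetchKeyFromKeyStore(keyId: string): Promise<Buffer> {
  throw new Error(`wire up your key store to return the 32-byte key for ${keyId}`);
}

// Encrypt with AES-256-GCM before the credential ever touches the main database.
export async function encryptCredential(
  plaintext: string,
  keyId: string
): Promise<EncryptedCredential> {
  const key = await fetchKeyFromKeyStore(keyId);
  const iv = crypto.randomBytes(12); // 96-bit IV, as recommended for GCM
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    keyId,
    iv: iv.toString("hex"),
    authTag: cipher.getAuthTag().toString("hex"),
    ciphertext: ciphertext.toString("hex"),
  };
}

// Decrypt in memory only, immediately before making a request on the user's behalf.
export async function decryptCredential(c: EncryptedCredential): Promise<string> {
  const key = await fetchKeyFromKeyStore(c.keyId);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, Buffer.from(c.iv, "hex"));
  decipher.setAuthTag(Buffer.from(c.authTag, "hex"));
  return Buffer.concat([
    decipher.update(Buffer.from(c.ciphertext, "hex")),
    decipher.final(),
  ]).toString("utf8");
}
```

Because every credential gets its own IV and the keys live in a separate store, a dump of the main database alone yields nothing decryptable.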
2. Penetration tests
We regularly pen test our infrastructure to ensure that it’s equipped to prevent attackers from getting unauthorized access to both our customers’ and their users’ credentials.
Challenges with OAuth
While many services do use API Keys to authenticate requests, over 50% of the top SaaS companies use OAuth 2.0 to authorize requests to their APIs.
Generally, implementing OAuth-based authorization flows for integrations involves setting up services to handle the following (a minimal sketch of the token exchange follows the list):
- The initial authorization request to get the access and refresh tokens
- Storage of the access and refresh tokens
- Authenticating requests to the 3rd party API with the access tokens
- Using the refresh tokens to get new access tokens
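To make the first and third of those steps concrete, here’s a minimal sketch of the code-for-token exchange in TypeScript (the refresh grant, shown later, is nearly identical). The token URL, redirect URI, and client credentials are placeholders - every provider documents its own values.

```typescript
// Minimal authorization-code exchange (RFC 6749, section 4.1.3).
// TOKEN_URL and the client credentials below are placeholders, not real values.
const TOKEN_URL = "https://provider.example.com/oauth/token";

interface TokenResponse {
  access_token: string;
  refresh_token?: string; // not every provider returns one
  expires_in?: number;    // TTL in seconds; also optional in practice
  token_type: string;
}

async function exchangeCodeForTokens(code: string): Promise<TokenResponse> {
  const res = await fetch(TOKEN_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "authorization_code",
      code,
      redirect_uri: "https://yourapp.example.com/oauth/callback",
      client_id: process.env.CLIENT_ID!,
      client_secret: process.env.CLIENT_SECRET!,
    }),
  });
  if (!res.ok) throw new Error(`Token exchange failed with status ${res.status}`);
  return (await res.json()) as TokenResponse;
}
```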
Many teams initially think that since OAuth 2.0 is a ‘standard’, it would be trivial to implement across dozens of integrations.
However, there are many hidden challenges to overcome in order to build an auth layer for your integration roadmap - challenges that even our own team didn’t anticipate when we initially set out to build Paragon a few years ago, and that took years of dedicated engineering to solve.
Let’s talk about some of those challenges.
An Unstandardized Standard
The main issue with OAuth is that it isn’t really a protocol. Rather, it’s a skeleton of a protocol and everything is dependent on how the app developer decides to implement it.
Every app has its own interpretation of the OAuth standard (just look at each 3rd party application’s API documentation), which introduces significant variations and inconsistencies when it comes to how you need to address it.
Here are just a few of the many examples we’ve run into:
- The state parameter in the OAuth authorization request should support URL-encoded values, but Twitter’s does not.
- While you have to specify scopes for most apps, Mailchimp and Notion don’t use them at all.
- Every app can have very different refresh token policies. For Google, refresh tokens never expire and you can create as many as you’d like. But for Salesforce, each refresh token expires on a user-configurable basis and you can only issue 5 of them at a time.
- Some apps require Proof Key for Code Exchange (PKCE), which adds additional requirements and steps to the Authorization Code Flow, while others do not (a sketch of the PKCE step follows this list).
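Since PKCE trips up many first implementations, here’s what the extra step amounts to: generating a random code verifier and deriving an S256 challenge from it. A minimal sketch using Node’s built-in crypto module:

```typescript
import crypto from "node:crypto";

// PKCE (RFC 7636): a random code_verifier plus its S256-derived code_challenge.
export function createPkcePair() {
  // Sent later with the token request to prove we initiated the flow.
  const codeVerifier = crypto.randomBytes(32).toString("base64url");
  // Sent with the initial authorization request (code_challenge_method=S256).
  const codeChallenge = crypto
    .createHash("sha256")
    .update(codeVerifier)
    .digest("base64url");
  return { codeVerifier, codeChallenge };
}
```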
The list goes on, but the greatest challenge is undoubtedly around handling token refresh.
Complexities with Refreshing Tokens
Under OAuth, access tokens typically have a time-to-live, or TTL (the expires_in parameter of a token response), before they expire and become invalid.
When an access token expires, a new one can be obtained using the provided refresh token.
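The refresh itself is a standard grant. A minimal sketch, with the same placeholder endpoint and credentials as before; note that the response may contain a brand-new refresh token:

```typescript
// Standard refresh-token grant (RFC 6749, section 6). Placeholder URL and credentials.
async function refreshAccessToken(refreshToken: string): Promise<{
  access_token: string;
  refresh_token?: string; // some providers rotate the refresh token here
  expires_in?: number;
}> {
  const res = await fetch("https://provider.example.com/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "refresh_token",
      refresh_token: refreshToken,
      client_id: process.env.CLIENT_ID!,
      client_secret: process.env.CLIENT_SECRET!,
    }),
  });
  if (!res.ok) throw new Error(`Refresh failed with status ${res.status}`);
  return res.json();
}
```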
To prevent your users’ access tokens from expiring (which will break the connection, cause requests to fail, and inconveniently require your user to authenticate again), your authentication service needs to refresh them periodically in the background. But how?
Approach #1: Refreshing before every Request
The easiest implementation is to get a new access token using the refresh token each time an API call is made.
But as you can imagine, this scales poorly because you would have to double the number of requests your integration services need to make, which can easily lead to rate-limiting and load balancing issues.
Additionally, with integration use cases that don’t run jobs in the background (such as user-triggered workflows), longer periods of inactivity can lead to even the refresh tokens expiring.
Approach #2: Refreshing Periodically in the Background
So instead of refreshing before every request, we landed on a much more robust (though more complex) approach: Paragon runs a background job that periodically refreshes all of our users’ tokens, sending requests to sample endpoints to determine whether an access token is still valid.
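Here’s a simplified sketch of what such a background job can look like. The declared helpers are hypothetical stubs for the persistence layer and for the probe and refresh logic sketched elsewhere in this post - this shows the shape of the approach, not our actual implementation.

```typescript
interface Connection {
  id: string;
  provider: string;
  accessToken: string;
  refreshToken: string;
}

// Hypothetical stubs: your storage layer, the sample-endpoint probe (sketched
// later in this post), and the refresh grant (sketched earlier).
declare function loadAllConnections(): Promise<Connection[]>;
declare function isTokenValid(conn: Connection): Promise<boolean>;
declare function refreshAccessToken(
  refreshToken: string
): Promise<{ access_token: string; refresh_token?: string }>;
declare function persistTokens(conn: Connection): Promise<void>; // re-encrypt + store

async function refreshAllTokens(): Promise<void> {
  for (const conn of await loadAllConnections()) {
    if (await isTokenValid(conn)) continue; // still valid, nothing to do
    const tokens = await refreshAccessToken(conn.refreshToken);
    conn.accessToken = tokens.access_token;
    if (tokens.refresh_token) conn.refreshToken = tokens.refresh_token; // some apps rotate
    await persistTokens(conn);
  }
}

// Run on an interval comfortably shorter than the shortest provider TTL.
setInterval(refreshAllTokens, 15 * 60 * 1000);
```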
This approach resolves the two issues we outlined earlier - running into rate limits and tokens expiring because of inactivity.
However, implementing this into our auth infrastructure was significantly more complex than the first approach, as it led to us having to handle many complications and edge cases, including:
- Differing refresh policies
- Preventing race conditions
- Forced De-authorization
- Ambiguous errors
Differing Refresh Policies
To start, each app you want to integrate with may have implemented the token refresh flow differently.
Some apps let you keep your existing refresh token indefinitely.
- e.g. Google, where you can use a single refresh token forever
Some apps issue you a new refresh token after every access token retrieval.
- e.g. QuickBooks, which gives you a new refresh token every time you get a new access token
Some apps limit how many refresh tokens you can generate per organization.
- e.g. Salesforce limits an organization to 5 refresh tokens - any more and the rest are immediately invalidated
Some apps have expiring refresh tokens (with the expiry completely up to the app developer).
- e.g. Netsuite’s refresh tokens expire every 3 hours if unused, Outlook refresh tokens expire after 60 days, etc.
Some apps have different inactivity and absolute expirations.
- e.g. Jira has a 90-day inactivity expiry and a 365-day absolute expiry (which forces your users to re-auth no matter what)
While each of these adheres to the general OAuth standard, you can’t use the same approach to handle every integration’s auth. Not accounting for all of these edge cases can lead to production issues after your integration goes live.
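One way to tame this variance is to drive the refresh logic from a per-provider policy table instead of hard-coding each case. A sketch with illustrative field names of our own invention, populated with the policies described above:

```typescript
// Illustrative per-provider refresh policies; field names are made up for this sketch.
interface RefreshPolicy {
  rotatesRefreshToken?: boolean;
  maxTokensPerOrg?: number;
  inactivityExpirySeconds?: number;
  absoluteExpirySeconds?: number;
}

const refreshPolicies: Record<string, RefreshPolicy> = {
  google: { rotatesRefreshToken: false },     // one refresh token, usable forever
  quickbooks: { rotatesRefreshToken: true },  // new refresh token on every refresh
  salesforce: { maxTokensPerOrg: 5 },         // more than 5 and the rest are invalidated
  netsuite: { inactivityExpirySeconds: 3 * 60 * 60 }, // expires after 3 hours unused
  jira: {
    inactivityExpirySeconds: 90 * 24 * 60 * 60, // 90-day inactivity expiry
    absoluteExpirySeconds: 365 * 24 * 60 * 60,  // 365-day absolute expiry
  },
};
```

The refresh job can then consult this table to decide how aggressively to refresh each connection and when a user genuinely has to re-authenticate.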
That’s why our integrations engineering team had to become OAuth experts in order to build the unified layer for auth that all our customers rely on for their products’ native integrations.
On the bright side, this led to Paragon releasing its Custom Integration Builder which enables customers to rely on our authentication service for any native integration, beyond our 100+ pre-built connectors.
Preventing Race Conditions
If you’re able to comprehensively handle the varying refresh policies, next comes the challenge of preventing race conditions when refreshing tokens - never fun to deal with in distributed systems.
Just as one example, if a token is being refreshed, but a concurrent request is made to the 3rd party API, what do you do?
In many cases, such as with QuickBooks, if you accidentally use a stale token, it will not only return a 400 error on the request, it will also invalidate all other tokens, including the one being refreshed.
To prevent race conditions, we introduced a token mutex as a locking mechanism.
This means that if a refresh job holds the token mutex, all requests to that specific 3rd party service are paused until the refresh completes.
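As a simplified illustration, a single-process version of that locking idea might look like the sketch below, keyed per connection. A real deployment across multiple workers would need a distributed lock (for example, one held in Redis) rather than an in-memory map.

```typescript
// One promise chain per connection: refreshes and API calls never overlap.
const locks = new Map<string, Promise<unknown>>();

async function withTokenLock<T>(connectionId: string, fn: () => Promise<T>): Promise<T> {
  const previous = locks.get(connectionId) ?? Promise.resolve();
  // Chain onto whatever is in flight, ignoring its outcome, then run our task.
  const next = previous.catch(() => {}).then(fn);
  locks.set(connectionId, next);
  return next;
}

// Hypothetical callers: the refresh job and outbound API requests both go
// through the same lock, so a request can never fire with a stale token.
declare function refreshTokens(connectionId: string): Promise<void>;
declare function callThirdPartyApi(connectionId: string): Promise<unknown>;

async function example() {
  await withTokenLock("conn-123", () => refreshTokens("conn-123"));
  await withTokenLock("conn-123", () => callThirdPartyApi("conn-123"));
}
```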
Handling Forced De-authorization
If that wasn’t enough, you have to also deal with forced de-authorization from the app’s side (which is more common than we expected). For example, we’ve seen vigilant Salesforce and Google Workspace admins manually revoke several connected apps at once. Since the app is deauthorized, it’s not always reliable to depend on the access token TTL to check its validity.
I hinted at this earlier: since we can’t rely on the TTL, we use sample endpoints for each app to test whether a token is still valid.
For example, we use `GET /rest/api/3/mypreferences/locale` for all Atlassian applications - if a 200 OK response is returned, our service knows the token is still valid; otherwise, it will use the refresh token to get a new access/refresh token.
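A probe along those lines might look like the sketch below. It assumes Atlassian’s per-site cloud API addressing (the `cloudId` parameter), so treat the details as illustrative:

```typescript
// Cheap, side-effect-free validity probe for Atlassian apps.
async function isAtlassianTokenValid(accessToken: string, cloudId: string): Promise<boolean> {
  const res = await fetch(
    `https://api.atlassian.com/ex/jira/${cloudId}/rest/api/3/mypreferences/locale`,
    { headers: { Authorization: `Bearer ${accessToken}` } }
  );
  return res.status === 200; // anything else -> fall back to the refresh flow
}
```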
Ambiguous Authentication Errors
Finally, there’s debugging auth errors. Very few of the services we’ve built integrations for provide sufficient explanations of why an error occurred, and in most cases the 3rd party app’s API docs either completely lack details on auth errors or offer only generic, unhelpful resources.
To make our OAuth client reliable and make debugging easier, we needed to be able to identify which errors are recoverable and which are not.
While OAuth defines standardized error codes - invalid_request, invalid_client, invalid_grant, unauthorized_client, unsupported_grant_type, and invalid_scope - the differing policies listed earlier make these errors incredibly difficult to debug, especially across dozens of services.
This led us to create a repository of error responses so that our auth service can identify which errors are recoverable and which aren’t. That repository has taken years to compile and is constantly updated as 3rd party apps change their APIs and authentication flows.
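Conceptually, that repository boils down to a lookup from provider, HTTP status, and OAuth error code to a recovery action. A heavily simplified sketch with illustrative entries:

```typescript
// Heavily simplified error-classification lookup; entries are illustrative only.
type ErrorAction = "refresh_and_retry" | "reauthorize" | "retry_later";

interface ErrorRule {
  status?: number;     // HTTP status returned by the provider
  oauthError?: string; // standard OAuth error code from the response body, if any
  action: ErrorAction;
}

// Per-provider rules first, then generic fallbacks under "*".
const errorRules: Record<string, ErrorRule[]> = {
  "*": [
    { oauthError: "invalid_grant", action: "reauthorize" }, // refresh token is dead
    { status: 401, action: "refresh_and_retry" },           // often just an expired access token
    { status: 429, action: "retry_later" },                 // rate-limited, not an auth failure
  ],
};

function classifyError(provider: string, status: number, oauthError?: string): ErrorAction {
  const rules = [...(errorRules[provider] ?? []), ...errorRules["*"]];
  const match = rules.find(
    (r) =>
      (r.status === undefined || r.status === status) &&
      (r.oauthError === undefined || r.oauthError === oauthError)
  );
  return match?.action ?? "reauthorize"; // fail safe: flag the connection for attention
}
```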
Closing Thoughts
Although auth is just the first step in building any native integration for your application, it is incredibly complex to get right and is a prerequisite to any of your integrations functioning properly.
That’s why it was critical for our customers that Paragon provides fully managed authentication - we wanted to take on the burden of auth so your team can focus on challenges that are unique to your product and business.
If you want to see how you can use Paragon to handle auth for your integrations, book some time with our team here.
But if you do decide that your team should own these challenges, I hope this article provides a blueprint for building your own authentication infrastructure for integrations.