Rate limits are a vital part of every API. Since we operate quite a few public, unprotected APIs, we need to rate limit them.
Implementing rate limits based on IP addresses is fairly easy, especially with Amazon API Gateway and AWS WAF. But limiting requests based on other identifiers, such as user IDs in JWTs or authorization headers, can get quite tricky.
That is where Stellate comes to the rescue 🦸🏽
Our Architecture
A short primer on our architecture and the parts involved:
There are two ways of accessing our API: client calls, or server-side-rendered calls from Vercel. Both pass through Stellate's Edge Cache and now also use its rate-limiting feature. From there, requests go to API Gateway on AWS.
Why not only IP?
The first question we usually get about rate limits is: why don't you just rate limit on IP?
While it makes sense to rate limit on the IP address, it is often a misleading identifier. Due to the shortage of IPv4 addresses, many mobile carriers and internet providers share one IP across many customers. Universities, dorms, and companies also often sit behind a single IP. If we rate limited that one IP, the whole university could lose access to Hashnode. This is not what we want.
Why Rate Limits?
There are two main reasons why we need rate limits.
Impact on the Database
First of all, we want to protect our database. You don't want people to be able to hit your database constantly. This incurs costs and can lead to downtime.
Yes, caching is the number one thing to consider here, and Stellate and Vercel already help us with that. But rate limits also help by ensuring that nobody bombards your API. Everything that isn't cached (e.g. mutations) hits the DB directly. We want to avoid that.
Impact on your Business
The second reason is that you simply don't want your product to be abused. We don't want a script generating posts automatically.
We need rate limits to ensure that nobody abuses our API and impacts the database.
Stellate Rate Limiting
Stellate is a CDN for GraphQL. It mainly offers the functionality of:
Caching GQL Requests on the Edge
Analytics and errors about your API
Rate Limiting
The rate-limiting feature is currently in its public beta phase. Check out their docs for more information.
To enable rate limits, you can simply add the rateLimits field to your Stellate config file (with TypeScript support!):
import { Config } from 'stellate'

const config: Config = {
  config: {
    rateLimits: [
      {
        name: 'IP limit',
        groupBy: 'ip',
        state: 'dryRun',
        limit: {
          type: 'RequestCount',
          window: '1m',
          budget: 50,
        },
      },
    ],
  },
}

export default config
The code above creates a rate limit of 50 requests per minute for each IP address. The state dryRun means that this rate limit is not enforced yet. Your dashboard will show you which requests would have been blocked, but they won't actually be blocked.
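To make the difference between dryRun and enabled concrete, here is a minimal in-memory sketch of a request-count limit. It assumes a simple fixed one-minute window; how Stellate actually counts requests internally is not covered in this post, and `createLimiter`, `Rule`, and `Decision` are hypothetical names used only for illustration.

```typescript
// Illustrative only: a tiny fixed-window counter, NOT Stellate's implementation.
interface Rule {
  name: string
  state: 'enabled' | 'dryRun'
  budget: number // allowed requests per window
  windowMs: number // window length in milliseconds
}

interface Decision {
  blocked: boolean // true only if the rule is enforced and the budget is used up
  wouldBlock: boolean // true whenever the budget is used up, even in dry run
  remaining: number
}

function createLimiter(rule: Rule): (key: string, now: number) => Decision {
  const counters = new Map<string, { windowStart: number; count: number }>()

  return (key, now) => {
    // All timestamps inside the same window map to the same window start.
    const windowStart = Math.floor(now / rule.windowMs) * rule.windowMs
    let entry = counters.get(key)
    if (entry === undefined || entry.windowStart !== windowStart) {
      entry = { windowStart, count: 0 } // first request of a new window
      counters.set(key, entry)
    }
    entry.count += 1

    const wouldBlock = entry.count > rule.budget
    return {
      blocked: wouldBlock && rule.state === 'enabled',
      wouldBlock,
      remaining: Math.max(0, rule.budget - entry.count),
    }
  }
}

// 51 requests from one IP within the same minute, against a budget of 50.
const dryRun = createLimiter({ name: 'IP limit', state: 'dryRun', budget: 50, windowMs: 60_000 })
let decision = dryRun('203.0.113.7', 0)
for (let i = 1; i <= 50; i++) decision = dryRun('203.0.113.7', i)
console.log(decision) // the 51st request is reported as over budget but not blocked
```

Switching `state` to `'enabled'` in this sketch is the only change needed to turn the report into an actual block, which is why starting in dry run is a low-risk way to tune the budget.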
Dry Run
The dry run mode in Stellate is an excellent feature for gaining a better understanding of the appropriate rate limit. Once you've activated it you can head over to your Stellate Dashboard, check the rate limiting dashboard, and see how many requests and customers would have been blocked. But no request will be blocked.
You can also send requests from the Stellate playground or from the API Client of your choice (cURL, Postman, Insomnia) and check the remaining budget.
In this example, I query my blog. In the result window on the right, I can see that the rule "Unauthenticated IP Limit - Request Count" was applied. I have 1998 of 2000 requests remaining.
Rate Limits
So much for the introduction. But how do we use rate limits at Hashnode? We mainly distinguish between two limits:
Authenticated access
Unauthenticated access
Authenticated Access
Authenticated access is everything where a token in a cookie or header is present. If this token is present we create a limit of 500 requests per minute.
{
  name: 'Authenticated User Limit - Request Count',
  groupBy: req.headers['token'],
  state: 'enabled',
  limit: {
    type: 'RequestCount',
    budget: 500,
    window: '1m'
  }
}
This means that each distinct value of the token header gets its own limit of 500 requests per minute. To test this, you can again use Stellate's dashboard.
Here we query my personal blog and access the title. At the bottom of the result, we can see the remaining limit. In this case, we have 499 of 500 requests left.
Unauthenticated Access
Unauthenticated access, on the other hand, is everything without an authentication token. In this case we group by the IP address of the user. This limit has 2000 requests per minute.
Why is this limit larger?
First of all, unauthenticated requests are typically cheaper in terms of computational cost. Querying a blog and creating a blog are very different workloads.
The second reason is IP sharing. We have seen many cases where one IP is shared by many users. In that case, we don't want the budget to be too tight. This is why we allow quite a bit more room for unauthenticated access.
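The two tiers can be sketched as a single selection function. This is a simplified illustration, not our production config: `EdgeRequest` and `RateLimitRule` are hypothetical local types modeled on the snippets in this post (the real types come from the `stellate` package), and only the `token` header is checked, not cookies.

```typescript
// Hypothetical, simplified shapes; assumptions for illustration only.
interface EdgeRequest {
  ip: string
  headers: Record<string, string | undefined>
}

interface RateLimitRule {
  name: string
  groupBy: string
  state: 'enabled' | 'dryRun'
  limit: { type: 'RequestCount'; budget: number; window: string }
}

function selectRateLimit(req: EdgeRequest): RateLimitRule {
  const token = req.headers['token']

  if (token) {
    // Authenticated: every token gets its own, tighter budget.
    return {
      name: 'Authenticated User Limit - Request Count',
      groupBy: token,
      state: 'enabled',
      limit: { type: 'RequestCount', budget: 500, window: '1m' },
    }
  }

  // Unauthenticated: group by IP, with more room because IPs are often shared.
  return {
    name: 'Unauthenticated IP Limit - Request Count',
    groupBy: 'ip',
    state: 'enabled',
    limit: { type: 'RequestCount', budget: 2000, window: '1m' },
  }
}

const anonymous = selectRateLimit({ ip: '203.0.113.7', headers: {} })
console.log(anonymous.name, anonymous.limit.budget)
```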
Rate Limits & Server-Side Rendering
Hashnode makes heavy use of Vercel and Server-Side Rendering (SSR). The problem with SSR and rate limiting is that many visitors can load blogs that are rendered by the same server. All of those API calls then come from the same IP address.
There are several ways to handle this:
Ignore SSR for rate limits
Forward the public IP & authorization header (if present) to Stellate.
(New) Use Vercel Secure Compute to assign a fixed IP to Vercel and whitelist that IP
We opted for the first solution, ignoring all SSR calls. We primarily chose this option because we wanted to address rate limiting for the API. This is also a preparatory step for making our API publicly available. It is not specifically intended to rate limit the client's usage.
You can do that by defining a shared secret between Vercel and Stellate. This secret can, for example, be a header that you forward with each API request.
⚠️ Be aware that this header needs to be treated as a secret. Only send it from the server side, never from the client side.
In Stellate you can then define the following:
if (
  req.headers['ssr-call'] &&
  req.headers['ssr-call'] === "123"
) {
  return [];
}
This will return no rate limit in case the call is coming from Vercel.
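Both sides of this handshake can be sketched in a few lines. The helper names `withSsrSecret` and `isSsrCall` are hypothetical, and the secret is passed in as a parameter here; in a real setup it would come from a server-only environment variable rather than a hard-coded value like "123".

```typescript
const SSR_HEADER = 'ssr-call'

// Vercel side (server only): add the shared secret to the outgoing API request headers.
function withSsrSecret(headers: Record<string, string>, secret: string): Record<string, string> {
  if (secret === '') {
    // Fail loudly instead of silently sending SSR traffic without the bypass header.
    throw new Error('SSR secret is not configured')
  }
  return { ...headers, [SSR_HEADER]: secret }
}

// Edge side: treat the call as SSR only when the header matches the secret exactly.
function isSsrCall(headers: Record<string, string | undefined>, secret: string): boolean {
  return secret !== '' && headers[SSR_HEADER] === secret
}

// Hypothetical usage inside a server-side data loader:
//   fetch(apiUrl, { method: 'POST', headers: withSsrSecret({ 'content-type': 'application/json' }, secret), body })
const outgoing = withSsrSecret({ 'content-type': 'application/json' }, 'example-secret')
console.log(isSsrCall(outgoing, 'example-secret'))
```

Rejecting an empty secret on both sides avoids the failure mode where a missing environment variable makes every request without the header look like a trusted SSR call.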
Block IPs
One more remarkable feature is the ability to block individual IP addresses. Unfortunately, we face attacks quite frequently. Often, these attacks originate from a single IP address. Blocking such an IP address using rate limits is incredibly simple:
if (ipListToBlock.includes(req.ip)) {
  return [
    {
      name: 'Blocked IP limit',
      groupBy: 'ip',
      state: 'enabled',
      limit: {
        type: 'RequestCount',
        budget: 0,
        window: '1m'
      }
    }
  ];
}
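When the SSR bypass and the block list live in the same function, the order of the checks matters. The sketch below shows one possible ordering, assuming that trusted SSR calls should always pass and that a blocked IP should stay blocked even if it sends a token. `rateLimitsFor`, `LimitOptions`, and the local types are hypothetical names for illustration, not part of Stellate's API.

```typescript
// Hypothetical, simplified shapes; assumptions for illustration only.
interface EdgeRequest {
  ip: string
  headers: Record<string, string | undefined>
}

interface RateLimitRule {
  name: string
  groupBy: string
  state: 'enabled' | 'dryRun'
  limit: { type: 'RequestCount'; budget: number; window: string }
}

interface LimitOptions {
  ssrSecret: string
  blockedIps: ReadonlySet<string>
  normalRules: (req: EdgeRequest) => RateLimitRule[] // the regular per-user or per-IP limits
}

function rateLimitsFor(req: EdgeRequest, opts: LimitOptions): RateLimitRule[] {
  // 1. Trusted SSR calls are exempt from all limits.
  if (opts.ssrSecret !== '' && req.headers['ssr-call'] === opts.ssrSecret) {
    return []
  }

  // 2. Blocked IPs get a zero budget before any other rule is considered.
  if (opts.blockedIps.has(req.ip)) {
    return [
      {
        name: 'Blocked IP limit',
        groupBy: 'ip',
        state: 'enabled',
        limit: { type: 'RequestCount', budget: 0, window: '1m' },
      },
    ]
  }

  // 3. Everyone else gets the regular limits.
  return opts.normalRules(req)
}

const options: LimitOptions = {
  ssrSecret: 'example-secret',
  blockedIps: new Set(['198.51.100.9']),
  normalRules: () => [
    {
      name: 'Unauthenticated IP Limit - Request Count',
      groupBy: 'ip',
      state: 'enabled',
      limit: { type: 'RequestCount', budget: 2000, window: '1m' },
    },
  ],
}

const blocked = rateLimitsFor({ ip: '198.51.100.9', headers: { token: 'abc' } }, options)
console.log(blocked[0].name, blocked[0].limit.budget)
```

A `Set` is used for the block list in this sketch so that the lookup cost stays constant as the list grows; the array with `includes` from the snippet above works just as well for a handful of addresses.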
That's it
That is everything on how we use rate limits.
Stellate's rate limiting has already saved us from a huge abusive traffic spike on our API, and it is super easy to implement!