Rate-limiting API calls

June 19, 2017 | Glynn Bird | API

With any API, if you exceed its rate limits, your request will get an “HTTP 429 Too Many Requests” response. The Cloudant Node.js library has a ‘retry’ plugin that will resend such requests. This approach is handy when you only occasionally hit a limit, perhaps at times when your site or app is unusually busy.

If you are routinely exceeding the quota, then no amount of retrying will help because your app will be systematically retrying a swathe of requests. In that case you need to look at upgrading your API access (if possible) or adding a layer of abstraction to handle your request rate.


I’ll use Cloudant’s most basic Lite plan as an example. Here are the data and API rate limits (at time of writing) for the database service:

To use this plan as efficiently as possible and keep your write requests to 10 per second, you have two options:

  1. make bulk requests: instead of writing 50 documents individually, write all 50 in one call to POST /db/_bulk_docs

  2. queue your requests and only allow the queue to be consumed at a rate that is less than the permitted level
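
As a rough illustration of the first option, here is a minimal sketch that groups documents into batches and builds one _bulk_docs request body per batch. The helper names chunk and bulkBodies are hypothetical, and the batch size of 50 is only an example; the { docs: [...] } body shape is what the _bulk_docs endpoint expects. Sending the request is left out.

```javascript
// Hypothetical helper: split an array of documents into batches
// of at most `size` items each.
function chunk(docs, size) {
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// Hypothetical helper: build one request body per batch.
// POST /db/_bulk_docs expects a JSON object of the form { docs: [...] }.
function bulkBodies(docs, size) {
  return chunk(docs, size).map(function (batch) {
    return { docs: batch };
  });
}

// 120 documents in batches of 50 need 3 API calls instead of 120
const docs = [];
for (let i = 0; i < 120; i++) {
  docs.push({ i: i, name: 'hello world' });
}
const bodies = bulkBodies(docs, 50);
console.log(bodies.length);         // 3
console.log(bodies[2].docs.length); // 20
```

Each element of bodies would then be sent as the JSON payload of a single POST /db/_bulk_docs call, so one write request carries many documents.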

Implement a rate-limited queue with qrate

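I wrote a Node.js module to help with the latter option. The qrate library lets you create queues and specify the concurrency (how many workers run at once) and the rate limit (the maximum number of items processed per second).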

Here’s how it works. First, bring in the cloudant-quickstart library to access a Cloudant database and the qrate package too:

const cqs = require('cloudant-quickstart');
const qrate = require('qrate');
const db = cqs('https://USER:PASS@HOST.cloudant.com/queue');

Then, define a “worker” function that deals with a single queue item. In this case, you want to write a document to Cloudant. The worker function receives the ‘document’ and calls ‘done’ when it’s finished:

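// the worker function:
// writes the document to Cloudant
const worker = function(document, done) {
  console.log('worker', document);
  // assumes db.insert returns a Promise, as in cloudant-quickstart;
  // call done when the write finishes, passing any error along
  db.insert(document).then(function() { done(); }).catch(done);
};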

You can then create a rate-limited queue using the qrate module:

var concurrency = 3; // three workers at a time
var rateLimit = 10; // maximum 10 items per second
var q = qrate(worker, concurrency, rateLimit);

Then, feed the documents you want to add to the database to the queue (q) with q.push:

for(var i = 0; i < 100; i++) {
  q.push( { i: i, name: 'hello world' } );
}

Even though there are 100 items in the queue and up to three workers running at once, the queue rate never exceeds 10 per second. So no 429 responses are received from Cloudant and no retry logic is required.
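
To make the idea of a capped rate more concrete, here is a minimal sketch of one way such a cap can be enforced: a sliding-window check that refuses to start more than a given number of items in any one-second window. This is not qrate's actual implementation; createLimiter and tryStart are hypothetical names, and the clock is simulated so the example runs instantly.

```javascript
// Hypothetical helper, not part of qrate: allows at most `limit` starts
// in any window of `windowMs` milliseconds. `now` can be injected so the
// limiter is usable with a simulated clock.
function createLimiter(limit, windowMs, now) {
  now = now || Date.now;
  let starts = []; // timestamps of recent starts
  return function tryStart() {
    const t = now();
    // forget starts that have fallen out of the window
    starts = starts.filter(function (s) { return t - s < windowMs; });
    if (starts.length >= limit) {
      return false; // over the limit: the caller should wait and try again
    }
    starts.push(t);
    return true;
  };
}

// Simulate 100 queued items against a fake clock that advances in 10 ms steps
let clock = 0;
const tryStart = createLimiter(10, 1000, function () { return clock; });
let pending = 100;
const startedAt = [];
while (pending > 0) {
  if (tryStart()) {
    startedAt.push(clock); // record when this item was allowed to start
    pending--;
  } else {
    clock += 10; // wait a little before trying again
  }
}
console.log(startedAt.length); // 100
console.log(clock);            // 9000
```

In this simulation the 100 items start in ten bursts of ten, one burst per simulated second, so no one-second window ever contains more than 10 starts.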

In a real application, you would add documents you want to save to the queue instead of writing them directly to Cloudant. The queue would ensure that the writes happen no faster than the permitted rate, with the excess building up in memory.

This approach is useful for processing calls to an API service that has a rate limit!

If the qrate API looks familiar to you, that is because it is based on the excellent async library, which provides a range of tools for writing asynchronous code in JavaScript. The qrate library is essentially the same as async.queue with an extra, optional rateLimit parameter. Without that parameter, it behaves as a normal async.queue.