Best & Worst Practice

January 18, 2018 | Stefan Kruger | Best-practice

So you’re new to Cloudant, but you’re not new to database systems. As an offering manager and engineer for Cloudant the past four years, I’ve had the chance to see the product from all angles: the customers who use it, the engineers that run it, and the folks who support and sell it.

image

In many ways, this article is inspired by Dimagi’s perspective as users. I’d like to add our perspective as providers of the database service, and summarize the best — and worst! — practices we see most often in the field.

Rule 0: Understand the API you are targeting

You may use Java, or Node.js or some other use-case specific language or platform that likely comes with convenient client-side libraries that integrate Cloudant access nicely, following the conventions you expect for your tools. This is great for programmer efficiency, but it also hides the API from view.

This abstraction is what you want, of course — the whole reason for using a client library is to save yourself repeated, tedious boiler-plating — but understanding the underlying API is vital when it comes to troubleshooting, and when reporting problems. When you report a suspected problem to Cloudant, it helps us help you if you can provide a way for us to reproduce the problem.

This does usually not mean cutting and pasting a hefty chunk of your application’s Java source verbatim into a support ticket, as we’re probably not able to build it. It usually does mean introducing uncertainties as to where the problem may be — your side or our side?

Cloudant’s support teams will usually ask you to provide the set of API calls, ideally as a set of curl commands that they can run, that demonstrates the issue. Adopting this approach to troubleshooting as a general rule also makes it easier for you to pinpoint where issues are failing. If your code is behaving unexpectedly, try to reproduce the problem using only direct access to the API. If you can’t, the problem isn’t with the Cloudant service itself.

If you suspect that a problem you’ve encountered lies with an officially supported client library, then try to construct a small, self-contained code example that demonstrates the issue, with as few other dependencies as possible. If you’re using Java, it is helpful to us if you can use a minimal test harness to highlight library issues.

Occasionally Cloudant receives support tickets that state that “Cloudant is broken because my application is slow” without much in terms of supporting evidence. Nearly always this is traced back to issues in the application code on the client side, or misconceptions about how Cloudant works.

Not always, but nearly always.

By understanding the API better, you also gain experience in how Cloudant behaves, especially in terms of performance.

Rule 1: Documents should group data that mostly change together

When you start to model your data, sooner or later you’ll run into the issue of how your documents should be structured. You’ve gleaned that Cloudant doesn’t enforce any kind of normalisation and that it has no transactions of the type you’re used to from, say, Postgres, so the temptation can be to cram as much as possible into each document, given that this would also save on HTTP overhead.

This is often a bad idea.

If your model groups information together that doesn’t change together, you’re more likely to suffer from update conflicts.

Consider a situation where you have users, each having a set of orders associated with them. One way might be to represent the orders as an array in the user document:

{ // DON'T DO THIS
    "customer_id": 65522389,
    "orders": [ {
      "order_id": 887865,
      "items": [ {
          "item_id": 9982,
          "item_name": "Iron sprocket",
          "cost": 53.0
        }, {
          "item_id": 2932,
          "item_name": "Rubber wedge",
          "cost": 3.0
        }
      ]
    }    
  ]
}

To add a new order, I need to fetch the complete document, unmarshal the JSON, add the item, marshal the new JSON, and send it back as an update. If I’m the only one doing so, it may work for a while. If the document is being updated concurrently, or being replicated, we’ll likely see update conflicts.

Instead, keep orders separate as their own document type, referencing the customer id. Now the model is immutable. To add a new order, I simply create a new order document in the database, which cannot generate conflicts.

To be able to retrieve all orders for a given customer, we can employ a view, which we’ll cover later.

Avoid constructs that rely on updates to parts of existing documents, where possible. Bad data models are often extremely hard to change once you’re in production.

Rule 2: Keep documents small

Cloudant imposes a max doc size of 1 MB. This does not mean that a close-to-1-MB document size is a good idea. On the contrary, if you find you are creating documents that exceed single-digit KB in size, you should probably revisit your model. Several things in Cloudant becomes less performant as documents grow. JSON decoding is costly, for example.

Rule 3: Avoid using attachments

Cloudant has support for storing attachments alongside documents, a long-standing feature it inherits from CouchDB. It can be handy to be able to store small icons and other static assets such as CSS and JavaScript files with the data if you’re using Cloudant as a backend for a web application.

There are a few things to consider before using attachments in Cloudant today, especially if you’re looking at larger assets such as images and videos:

  1. Cloudant is expensive as a block store
  2. Cloudant’s internal implementation is not efficient in handling large amounts of binary data

So: slow and expensive.

It’s ok for small assets and/or occasional use, but as a rule, if you need to store binary data alongside Cloudant documents, it’s better to use a separate solution more suited for this purpose, and store only the attachment metadata in the Cloudant document. Yes, that means some extra code you need to write to upload the attachment to a suitable block store of your choice, verify that it succeeded before storing the token or URL to the attachment in the Cloudant document.

Your databases will be smaller, cheaper, faster, and easier to replicate.

Rule 4: Fewer databases are better than many

If you can, limit the number of databases per Cloudant account to 500 or fewer. Whilst there is nothing magic about this particular number (Cloudant can safely handle more), there are several use cases that are adversely affected by large numbers of databases in an account.

The replicator scheduler has a limited number of simultaneous replication jobs it is prepared to run. That means that as the number of databases grow, the replication latency is likely to increase if you try to replicate everything contained in an account.

There is an operational aspect which is the flip side of the same coin: Cloudant’s operations team relies on replication, too, in order to move accounts around. By keeping down the number of databases you help us help you, should you need to shift your account from one location to another.

So when should you use a single database and distinguish between different document types using views, and when should you use multiple databases to model your data? Cloudant can’t federate views across multiple databases, so if you have data that is unrelated to the extent that they will never be “joined” or queried together, then that data could be a candidate for splitting across multiple databases.

Rule 5: Avoid the “database per user” anti-pattern like the plague

If you’re building out a multi-user service on top of Cloudant, it is tempting to let each user store their data in a separate database under the application account. That works well, mostly, if the number of users is small.

Now add the need to derive cross-user analytics. The way you do that is to replicate all the user databases into a single analytics DB. All good. Now, this app suddenly becomes successful, and the number of users grow from 150 to 20,000. Now we have 20,000 replications just to keep the analytics DB current. If we also want to run in an active-active DR setup, we add another 20,000 replications and basically the system will stop functioning.

Instead, multiplex user data into fewer databases, or shard users into a set of databases or accounts, or both. That way there is no need to replicate to provide an analytics DB, but auth becomes more complicated as Cloudant only provides authentication at the database level.

It’s worth stating that the “database-per-user” approach is tempting because Cloudant permissions are “per database” — it’s not really the users’ fault that this pattern has emerged.

Rule 6: Avoid writing custom JavaScript reduce functions

The map-reduce views in Cloudant are awesome. However, with great power comes great responsibility. The map-part of a map-reduce view is built incrementally, so shoddy code in the map impacts only indexing time, not query time. The reduce-part, unfortunately, will execute at query time. Cloudant provides a set of built-in reduce functions that are implemented internally in Erlang, which are performant at scale, and which your hand-crafted JavaScript reduces are not.

If you find yourself writing reduce functions, stop and consider if you could re-organise your data so that this isn’t necessary, or so that you’re able to rely on the built-in reducers.

Actually, we can be even clearer on this: if you find you are using custom JavaScript reduces, you’re doing it wrong.

Rule 7: Understand the trade-offs in emitting data or not into a view

As the document referenced by a view is always available using include_docs=true, it is possible to do something like:

emit(doc.indexed_field, null);

in order to allow lookups on indexed_field. This has advantages and disadvantages.

  1. The index is compact — this is good, as index size contributes to storage costs
  2. The index is robust — as the index does not store the document, you can access any field without thinking ahead of what to store in the index
  3. The disadvantage is that getting the document back is more costly than the alternative of emitting data into the index itself, as the database first has to look up the requested key in the index, and then read the associated document. Also, if you’re reading the whole document, but actually need only a single field, you’re making the database read and transmit data you don’t need.

This also means that there is a potential race condition here — the document may have changed, or been deleted between the index and document read (although unlikely in practice).

Emitting data into the index (a so-called “projection” in relational algebra terms) means that you can fine-tune the exact subset of the document that you actually need. In other words, you don’t need to emit the whole document. Only emit a value which represents the data you need in the app i.e. a cut-down object with minimal details, for example:

emit(doc.indexed_field, {name: doc.name, dob: doc.dob});

Of course, if you change your mind on what fields you want to emit, the index will need rebuilding.

Cloudant Query’s (CQ) JSON indexes use views this way under the hood. CQ can be a convenient replacement for some types of view queries, but not all. Do take the time to understand when to use one or the other.

Rule 8: Never rely on the default behaviour of Cloudant Query’s no-indexing

It’s tempting to rely on CQs ability to query without creating explicit indexes. This is extremely costly in terms of performance, as every lookup is a full scan of the database rather than an indexed lookup. If your data is small, this won’t matter, but as the dataset grows, this will become a problem for you, and for the cluster as a whole. It is likely that we will limit this facility in the near future. The Cloudant dashboard allows you to create indexes in an easy way.

Creating indexes and crafting CQs that take advantage of them requires some flair. To identify which index is being used by a particular query, send a POST to the _explain endpoint for the database, with the query as data.

Cloudant Query docs.

Rule 9: In Cloudant Search (or CQ indexes of type text), limit the number of fields

Cloudant Search and CQ indexes of type text (both of which are Apache Lucene under the hood) allow you to index any number of fields into the index. We’ve seen some examples where this is abused either deliberately, or most often fat-fingered. Plan your indexing to comprise only the fields required by your actual queries. Indexes take space and can be costly to rebuild if the number of indexed fields are large.

There’s also the issue of which fields you store in a Cloudant Search. Stored fields are retrieved in the query without doing include_docs=true so the trade-off is similar to Rule 7.

Rule 10: Resolve your conflicts

Cloudant is designed to treat conflicts as a natural state of data in a distributed system. This is a powerful feature that helps a Cloudant cluster be available at all times. However, the assumption is that conflicts are still reasonably rare. Keeping track of conflicts in Cloudant’s core has significant cost associated with it.

It is perfectly possible (but a bad idea!) to just ignore conflicts, and the database will carry on operating by choosing a random, but deterministic revision of conflicted documents. However, as the number of unresolved conflicts grows, the performance of the database goes down a black hole, especially when replicating.

As a developer, it’s your responsibility to check for, and to resolve, conflicts — or even better, employ data models that make conflicts impossible.

If you routinely create conflicts, you should consider model changes.

Rule 11: Deleting documents won’t delete them

Deleting a document from a Cloudant database doesn’t actually purge it. Deletion is implemented by writing a new revision of the document under deletion, with an added field _deleted: true. This special revision is called a tombstone. Tombstones still take up space and are also passed around by the replicator.

Models that rely on frequent deletions of documents are not suitable for Cloudant.

Rule 12: Be careful with updates

It is more expensive in the longer run to mutate existing documents than to create new ones, as Cloudant will always need to keep the document tree structure around, even if internal nodes in the tree will be stripped of their payloads. If you find that you create long revision trees, your replication performance will suffer. Moreover, if your update frequency goes above, say, once or twice every few seconds, you’re more likely to produce update conflicts.

Prefer models that are immutable.

The obvious question after rules 11 and 12 is — won’t the data set grow unbounded if my model is immutable? If you accept that deletes don’t fully purge the deleted data and that updates are actually not updating in place, in terms of data volume growth there is not much difference. Managing data volume over time requires different techniques. The only way to truly reclaim space is to delete databases, rather than documents. You can replicate only winning revisions to a new database and delete the old to get rid of lingering deletes and conflicts. Or perhaps you can build it into your model to regularly start new databases (say ‘annual data’) and archive off (or remove) outdated data, if your use case allows.

Rule 13: Replication isn’t magic

“So let’s set up three clusters across the world, Dallas, London, Sydney, with bidirectional synchronisation between them to provide real time collaboration between our 100,000 clients.”

No.

Cloudant is good at replication. It’s so effortless that it can seem like magic, but note that it makes no latency guarantees. In fact, the whole system is designed with eventual consistency in mind. Treating Cloudant’s replication as a real time messaging system will not end up in a happy place. For this use case, put a system in between that was designed for this purpose, such as Apache Kafka.

It’s difficult to put a number on replication throughput — the answer is always “it depends”. Things that impact replication performance include, but are not limited to:

  1. Change frequency
  2. Document size
  3. Number of simultaneous replication jobs on the cluster as a whole
  4. Wide (conflicted) document trees

Rule 14: Use the bulk API

Cloudant has nice API endpoints for bulk loading (and reading) many documents at once. This can be much more efficient than reading/writing many documents one at a time. The write endpoint is:

${database}/_bulk_docs.

Its main purpose is to be a central part in the replicator algorithm, but it’s available for your use, too, and it’s pretty awesome.

With _bulk_docs, in addition to creating new docs you can also update and delete. Some client libraries, including PouchDB, implement create, update and delete even for single documents this way for fewer code paths.

Here is an example creating one new, updating a second existing, and deleting a third document:

curl -XPOST 'https://ACCT.cloudant.com/DB/_bulk_docs' \ 
     -H "Content-Type: application/json" \ 
     -d '{"docs":[{"baz":"boo"}, \ 
         {"_id":"463bd...","foo":"bar"}, \      
         {"_id":"ae52d...","_rev":"1-8147...","_deleted": true}]}'

You can also fetch many documents at once by issuing a POST to _all_docs(there is also a newish endpoint called _bulk_get, but this is probably not what you want  —  it’s there for a specific internal purpose).

To fetch a fixed set of docs using _all_docs, POST with a keys body:

curl -XPOST 'https://ACCT.cloudant.com/DB/_all_docs' \ 
     -H "Content-Type: application/json" \ 
     -d '{"keys":["ab234....","87addef...","76ccad..."]}'

Note that Cloudant (at the time of writing) imposes a max request size of 1 MB, so _bulk_docs requests exceeding this size will be rejected.

Rule 15: Eventual Consistency is a harsh taskmaster (a.k.a. don’t read your writes)

Eventual consistency is a great idea on paper, and a key contributor to Cloudant’s ability to scale out in practice. However, it’s fair to say that the mindset required to develop against an eventually consistent data store does not feel natural to most people.

You often get stung when writing tests:

  1. Create a database
  2. Populate the database with some test data
  3. Query the database for some subset of this test data
  4. Verify that the data you got back is the data you expected to get back

Nothing wrong with that? That works on every other database you’ve ever used, right?

Not on Cloudant.

Or rather, it works 99 times out of 100.

The reason for this is that there is a (mostly) small inconsistency window between writing data to the database, and this being available on all nodes of the cluster. As all nodes in a cluster are equal in stature, there is no guarantee that a write and a subsequent read will be serviced by the same node, so in some circumstances the read may be hitting a node before the written data has made it to this node.

So why don’t you just put a short delay in your test between the write and the read? That will make the test less likely to fail, but the problem is still there.

Cloudant has no transactional guarantees, and whilst document writes are atomic (you’re guaranteed that a document can either be read in its entirety, or not at all), there is no way to close the inconsistency window. It’s there by design.

A more serious concern that should be at the forefront of every developer’s mind is that you can’t safely assume that data you write will be available to anyone else at a specific point in time. This takes some getting used to if you come from a different kind of database tradition.

Testing Tip: what you can do to avoid the inconsistency window in testing is to test against an single-node instance of Cloudant or CouchDB running say in Docker (docker stuff here). A single node removes the eventual consistency issue, but beware that you are then testing against an environment that behaves differently to what you will target in production. Caveat Emptor.

Rule 16: Don’t mess with Q, R and N unless you really know what you are doing

Cloudant’s quorum and sharding parameters, once you discover them, seem like tempting options to change the behaviour of the database.

Stronger consistency — surely just set the write quorum to the replica count?

No! Recall that there is no way to close the inconsistency window in a cluster.

Don’t go there. The behaviour will be much harder to understand especially during network partitions. If you’re using Cloudant-the-service, the default values are fine for the vast majority of users.

There are times when tweaking the shard count for a database is essential to get the best possible performance, but if you can’t say why this is, you’re likely to make your situation worse.

Rule 17: Design document (ddoc) management requires some flair

As your data set grows, and your number of views goes up, sooner or later you will want to ponder how you organise your views across ddocs. A single ddoc can be used to form a so-called view group: a set of views that belong together by some metric that makes sense for your use case. If your views are pretty static, that makes your view query URLs semantically similar for related queries. It’s also more performant at index time because the index loads the document once and generates multiple indexes from it.

Ddocs themselves are read and written using the same read/write endpoints as any other document. This means that you can create, inspect, modify and delete ddocs from within your application. However, even small changes to ddocs can have big effects on your database. When you update a ddoc, all views in it become unavailable until indexing is complete. This can be problematic in production. To avoid it you have to do a crazy ddoc-swapping dance (see couchmigrate).

In most cases, this is probably not what you want to have to deal with. As you start out, it is most likely more convenient to have a one-view-per-ddoc policy.

Also, in case it isn’t obvious, views are code and should be subject to the same processes you use in terms of source code version management for the rest of your application code. How to achieve this may not be immediately obvious. You could version the JS snippets and then cut & paste the code into the Cloudant dashboard to deploy whenever there is a change, and yes, we all resort to this from time to time.

There are better ways to do this, and this is one reason to use some of the tools surrounding the couchapp concept. A couchapp is a self-contained CouchDB web application that nowadays doesn’t see much use. Several couchapp tools exist that are there to make the deployment of a couchapp — including its views, crucially — easier.

Using a couchapp tool means that you can automate deployment of views as needed, even when not using the couchapp concept itself.

Rule 18: Cloudant is rate limited — let this inform your code

Cloudant-the-service (unlike vanilla CouchDB) is sold on a “reserved throughput capacity” model. That means that you pay for the right to use up to a certain throughput, rather than the throughput you actually end up consuming. This takes a while to sink in. One somewhat flaky comparison might be that of a cell phone contract where you pay for a set number of minutes regardless of whether you end up using them or not.

Although the cell phone contract comparison doesn’t really capture the whole situation. There is no constraint on the sum of requests you can make to Cloudant in a month; the constraint is on how fast you make requests.

It’s really a promise that you make to Cloudant, not one that Cloudant makes to you: you promise to not make more requests per second than what you said you would up front. A top speed limit, if you like. If you transgress, Cloudant will fail your requests with a status of 429: Too Many Requests. It’s your responsibility to look out for this, and deal with it appropriately, which can be difficult when you’ve got multiple app servers. How can they coordinate to ensure that they collectively stay below the requests-per-second limit?

Cloudant’s official client libraries have some built-in provision for this that can be enabled (note: this is switched off by default to force you to think about it), following a “back-off & retry” strategy. However, if you rely on this facility alone you will eventually be disappointed. Back-off & retry only helps in cases of temporary transgression, not a persistent butting up against your provisioned throughput capacity limits.

Your business logic must be able to handle this condition. Another way to look at it is that you get the allocation you pay for. If that allocation isn’t sufficient, the only solution is to pay for a higher allocation.

Provisioned throughput capacity is split into three different buckets: Lookups, Writes and Queries. A Lookup is a “primary key” read — fetching a document based on its _id. A Write is storing a document or attachment on disk, and a Query is looking up documents via a secondary index (any API endpoint that has a _design or _find in it).

You get different allocations of each and the ratios between them are fixed. This fact can be used to optimise for cost. As you get 20 Lookups for every 1 Query (per second), if you find that you’re mainly hitting the Query limit but you have plenty headroom in Lookups, it may be possible to reduce the reliance on Queries through some remodelling of the data or perhaps doing more work client-side.

The corollary here though is that you can’t assume that any 3rd party library or framework will optimise for cost ahead of convenience. Client-side frameworks that support multiple persistence layers via plugins are unlikely to be aware of this, or perhaps even incapable of make such trade-offs.

Checking this before committing to a particular tool is a good idea.

It is also worth understanding that the rates aren’t directly equivalent to HTTP API endpoint calls. You should expect that for example a bulk update will count according to its constituent document writes.

Thanks

Thanks to Glynn Bird, Robert Newson, Richard Ellis, Mike Broberg, and Jan Lehnardt for helpful comments. Any remaining errors are mine alone, and nothing to do with them.