Cloudant’s BigCouch is open source
Brad Anderson
August 30, 2010

The Cloudant team is pleased to make its ‘BigCouch’ software project available as open source software under the Apache 2.0 license. We have been the beneficiaries of countless open source projects while constructing this system, so it is only fitting that we share our efforts in the hope that people will not only find it useful, but also help us make it better.
It has taken us a while to get to this point. The version we are opening has benefited from Cloudant operating our systems in production for our customers for almost a year. The recent (and heavy) refactoring effort incorporates what we have learned over this time, and we believe it is now time to share this system with the world. We will now be focusing our efforts on documentation and ease of use, as is the case with most newly opened projects.
What does it do? Think of BigCouch as a set of Erlang/OTP applications that allow you to create a cluster of CouchDBs distributed across many nodes/servers. Instead of one big honking CouchDB, the result is an elastic data store which is fully CouchDB API-compliant. The top picture shows the standalone CouchDB setup that is common today. When you download BigCouch and build with make dev (see the README.md file), the result is the bottom picture, a three-node development cluster.

The clustering layer is most closely modeled after Amazon's Dynamo, with consistent hashing, replication, and quorum for read/write operations. CouchDB view indexing occurs in parallel on each partition, and can achieve impressive speedups as compared to standalone serial indexing.
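To make the Dynamo-style ideas above a little more concrete, here is a minimal Python sketch of consistent hashing with a replica preference list and quorum reads and writes. It is an illustration under stated assumptions, not BigCouch's actual code: the names Ring, preference_list, quorum_write and quorum_read, the virtual-node count, the parameters n, r and w, and the integer revision numbers are all hypothetical choices made for this example.

```python
import hashlib
from bisect import bisect_right


def _hash(key):
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Hypothetical consistent-hash ring; each node owns several virtual points."""

    def __init__(self, nodes, vnodes=16, n=3):
        self.n = min(n, len(set(nodes)))  # replicas kept per document
        self.points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.points]

    def preference_list(self, doc_id):
        """Walk clockwise from the document's hash, collecting n distinct nodes."""
        start = bisect_right(self.keys, _hash(doc_id))
        owners = []
        for i in range(len(self.points)):
            node = self.points[(start + i) % len(self.points)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == self.n:
                break
        return owners


def quorum_write(ring, stores, down, doc_id, rev, value, w=2):
    """Write to every reachable replica; succeed if at least w acknowledge."""
    acks = 0
    for node in ring.preference_list(doc_id):
        if node in down:
            continue  # simulated unreachable node
        stores[node][doc_id] = (rev, value)
        acks += 1
    return acks >= w


def quorum_read(ring, stores, down, doc_id, r=2):
    """Read from reachable replicas; need r replies, return the highest revision."""
    replies = [
        stores[node].get(doc_id)
        for node in ring.preference_list(doc_id)
        if node not in down
    ]
    if len(replies) < r:
        return None  # quorum not met
    found = [item for item in replies if item is not None]
    return max(found, key=lambda item: item[0]) if found else None
```

With three nodes and n=3, a write with w=2 still succeeds when one node is down, and fails when two are down. Real CouchDB revisions are not plain integers, so the "highest revision wins" rule here is a simplification.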
Also contained within BigCouch is a tailor-made RPC server application called Rexi. Rexi better fits the needs of this distributed data store by dropping some unneeded overhead in rex, the RPC server that ships with Erlang/OTP. Rexi is optimized for the case when you need to spawn a bunch of remote processes. Cast messages are sent from the origin to the remote rexi server, and local processes are spawned from there, which is vastly more efficient than spawning remote processes from the origin.
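The cast-then-spawn-locally pattern described above can be sketched in Python as a rough, in-process analogy. This is not Rexi's API and not Erlang: threads and queues stand in for Erlang processes and message passing, and the names NodeServer, cast and fan_out are hypothetical, chosen only for this illustration. The origin sends one fire-and-forget message per node, and each node's server spawns the worker on its own side and has it reply directly to the origin.

```python
import queue
import threading


class NodeServer:
    """Hypothetical stand-in for a per-node RPC server with a mailbox."""

    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def cast(self, reply_to, ref, func, args):
        """Fire-and-forget: enqueue the request and return immediately."""
        self.mailbox.put((reply_to, ref, func, args))

    def _loop(self):
        # Server loop: for each cast, spawn a worker locally on this "node".
        while True:
            request = self.mailbox.get()
            threading.Thread(target=self._run, args=request, daemon=True).start()

    def _run(self, reply_to, ref, func, args):
        # The worker replies straight to the origin's queue, tagged with ref.
        reply_to.put((ref, self.name, func(*args)))


def fan_out(servers, func, args, timeout=2.0):
    """Cast the same job to every server, then collect one reply per server."""
    replies = queue.Queue()
    for ref, server in enumerate(servers):
        server.cast(replies, ref, func, args)
    results = {}
    for _ in servers:
        _ref, node, value = replies.get(timeout=timeout)
        results[node] = value
    return results
```

For example, fan_out([NodeServer("node1"), NodeServer("node2")], pow, (2, 10)) casts the same job to both servers and gathers their replies. The origin never creates the workers itself; it only sends messages, which is the property the paragraph above attributes to Rexi.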
So, let us know what you think. We will be hanging around Freenode IRC in the #cloudant channel. Pull the code, build it, install, and begin playing. Document, enhance, give back, benchmark, blog, tweet, …and enjoy.