Cloudant

The database designed for the web 

NoSQL Live From Boston

We're very excited to be joining up with our friends at 10gen to sponsor the NoSQL Live conference right here in Boston, our backyard.  The official press release can be found here.

The event will take place on March 11th at the John Hancock hotel.  A number of us from Cloudant will be at the conference in both official (panel discussions, lightning talks) and unofficial (milling about, drinking) capacities.  If you are in the New England area or plan on coming to the New England area, you should register and attend the conference.  You can register here.

From the 10gen events site

About:

A one-size-fits-all approach to databases no longer applies. Relational databases have worked well - and will continue to - for highly transactional systems. But today's web applications require enormous scalability and performance. This has spurred the growth of a new class of databases that trade off some of the features of relational databases to offer high performance, ease of programming, high availability & the ability to scale in cloud environments. They are collectively called NoSQL or non-relational databases.

NoSQL Live from Boston is a full-day interactive conference that brings together the thinking in this space. Picking up where other NoSQL events have left off, NoSQL Live goes beyond understanding the basics of these databases to how they are used in production systems. It features panel discussions on use cases, lightning talks, networking session, and a NoSQL Lab where attendees can get a practical view of programming with NoSQL databases.

Interested in presenting or sponsorship opportunities? Contact meghan@10gen.com

Posted by Alan Hoffman 


Welcome Dr. Dave Hardtke

The Cloudant team is growing! 

We're excited to welcome Dr. Dave Hardtke, who has joined Cloudant as Director of Search.  Like many of us here at Cloudant, Dave began his career as an experimental particle physicist.  In over a decade of experience Dave illuminated the secrets of the universe at various experiments such as STAR (Brookhaven National Lab), ICECUBE (Antarctica), and NA44 (CERN).  Dave's seminal work led to the discovery of the Quark Gluon Plasma, the 2005 physics story of the year.  More recently Dave has turned his talents in algorithms and analyses to search, where he has built his own startup, stinkyteddy.com, a general purpose search engine combining multiple keyword-driven search feeds and intelligent semantic analysis to deliver the most timely search results regardless of source and content types.  Prior to that, Dave served as Chief Scientist at Surf Canyon, a startup focusing on personalization of real-time search.

Dave is accelerating Cloudant Search.  As we add customers it has become clear that they are hungry not just for a flexible way to store their data, but for a powerful way to "discover" their data. Search is a critical component in nearly all data-driven applications.  We are therefore expanding our efforts to integrate a robust search platform into our CouchDB-based cloud database service.  Dave brings his expertise in internet and enterprise search technology to the project.

Please join me in welcoming Dave to the team.

Posted by Alan Hoffman 


Post-Mortem on Overnight Downtime

After looking into the problems from last night, we believe we've found the cause. In the database cluster, we tried a new version of dynamic code loading in a distributed Erlang environment.  We learned that this introduced a single point of failure in the database, and we've since reverted the system to eliminate this problem.  We were running the test to improve performance for customers, and with today's update we have resolved the issue.

Again, we're sorry to all those affected.  We appreciate your patience as we work to improve Cloudant's performance and service.

Posted by Alan Hoffman 


Technical Difficulties

Hi to everyone out there in Cloudant land.  In the wee morning hours (Eastern Standard Time) we had some technical difficulties with our database cluster.  We're really sorry to those of you who were affected.  We think we have found the issue and resolved it.  We want to do a full post-mortem before declaring victory, but operations should be back to normal now.

Again, we're very sorry to those people who saw problems.  Please let us know if you were affected.  We're working hard to prevent things like this from happening in the future.

Posted by Alan Hoffman 


Cloudant.com Refresh: CouchDB in the Cloud

Our new front page just went live on Cloudant.com. The main goal of this redesign is to give more information about the service we are currently rolling out. While we are still in private beta, we are starting to accept more people onto our platform and therefore we are ready to tell you more about what we are doing. We are currently aiming at existing CouchDB users, and to them we are offering professional hosting of CouchDB in the cloud. 

The new design follows the AIDA principle, according to which a successful marketing flow involves four main phases: attention (A), interest (I), desire (D), and action (A).

The “CouchDB in the Cloud” headline is designed to attract the attention of CouchDB users. The screenshot behind it is meant to evoke interest by giving a peek at the features available in our dashboard. Underneath it, the three “features” listed aim to create desire. They link our offering to CouchDB by stating that we provide all the benefits of the database that our potential customers already love, but they also highlight what we have to offer on top of that: a scalable and managed infrastructure and the chance to use it for free. Finally, the two buttons, placed in what the Gutenberg Rule calls the “terminal area”, provide options for action.

Let us know what you think about this new design.


Changing the World, One Password at a Time

Since launching the beta we’ve received a bunch of feedback along the lines of “Uh... I can’t change my password, wtf?” Yes, wtf indeed. We knew we would be building this feature eventually but we didn’t want to hold up the release for it. Well, now we have it on our new Account Options page.

You can access it by selecting ‘account options’ from the user account pulldown on the top right of the page. Thanks for all the feedback and keep it coming.


API Keys: No Really, We're Listening

Since Monday we’ve been sending out invites to the private beta (if you haven’t gotten yours yet, be patient, it’s coming). And we’ve gotten a lot of great feedback on how to make the service better. One suggestion that came from a few users was to allow database access to non-Cloudant users through API keys that can be shared with anyone. Well, your requests did not fall on deaf ears. For any of your databases you can now navigate to the permissions tab and create a database-specific API key and password. You can then set permissions for that key as if it were a Cloudant user. The screenshot below shows what that should look like. Thanks to everyone for the great suggestion. You keep the feature requests coming and we’ll get to them as fast as we can.
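As a rough illustration of how a shared key might be used from a client, the sketch below builds a request that authenticates with a key and password over HTTP Basic auth. The use of Basic auth for API keys, the host `example.cloudant.com`, the database name, and the key values are all assumptions made for the example; none of them come from this post.

```python
import base64
import urllib.request


def build_request(base_url, db, api_key, api_password):
    """Build a GET request for a database, authenticating with an API key.

    Assumption: the key/password pair is sent as HTTP Basic credentials,
    the same way a regular user name and password would be.
    """
    credentials = f"{api_key}:{api_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    req = urllib.request.Request(f"{base_url.rstrip('/')}/{db}", method="GET")
    req.add_header("Authorization", f"Basic {token}")
    return req


# Hypothetical account host, database, and key values.
request = build_request(
    "https://example.cloudant.com", "testdb", "somekey", "somepassword"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be read and run without a live account.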

Posted by Alan Hoffman 


Benchmarking CouchDB with Baracus

Baracus is a new project built specifically for benchmarking CouchDB. Using trusty httperf, Baracus makes it easy to create and perform a battery of tests based on simple configuration files. In this post I'll go into the details of a Baracus run and show you how to run your own benchmarks.

First, let's install Baracus. It depends on httperf so install it first either from source (check the README for compile options for httperf) or from your distribution. Note that Baracus uses gemcutter for gem hosting.

gem sources -a http://gemcutter.org

gem install baracus

Next, let's examine a simple Baracus config file.

httperf: "/usr/bin/httperf"
host: "localhost"
port: "5984"
user:
password:
db: "testdb"
report_url: "http://localhost:5984/bench"
name: "test123"
config:
  timeout: 30
  sessions: 100
  rate: 100
  doc_size: 100
  writes: 10
  reads: 10
  batchok: false
info:
  storage: "raid10, 4 disks, ebs"
  anything: "stuff"
  more: "foo"

This configuration file specifies where the httperf executable lives, along with the host, port, user, password, and database to use for the test. Baracus puts all of the results and wsesslog files back into CouchDB for later consumption; report_url is where you define this database.

As I mentioned, Baracus uses httperf to perform benchmarks. Specifically it uses httperf's wsesslog feature. wsesslog is a simple file format that describes what httperf should do each session. Here's an example of an entry generated with Baracus that does an HTTP GET:

/testdb/a12bf8572f788824dfd5169e2eb496ab method=GET

Baracus creates two wsesslogs for each run, one for writes (POST) and one for reads (GET). It generates these files from a few items in your configuration: the number of sessions, the document size, and how many reads or writes to do per session. You can also specify the rate at which httperf should perform the operations within each wsesslog file. Baracus creates the wsesslog for writing with random data and creates the read file with random docs from _all_docs. Since Baracus uses _all_docs, it performs all the writes first. The above config will create 100 sessions with 10 writes and 10 reads each at a rate of 100 per second. Each document will be roughly 100 bytes.
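To make the file layout concrete, here is a minimal sketch of how write and read wsesslogs of this shape could be generated. It is not Baracus's own code: the function names are hypothetical, the document IDs are made up locally instead of being fetched from _all_docs, and the JSON body layout is an assumption. It relies only on httperf's wsesslog conventions of one request per line, `method=` and `contents='...'` options, and a blank line between sessions.

```python
import random
import string
import uuid


def random_body(doc_size):
    """Return a small JSON document carrying doc_size random characters."""
    payload = "".join(random.choices(string.ascii_lowercase, k=doc_size))
    return '{"data":"%s"}' % payload


def write_sessions(db, sessions, writes, doc_size):
    """Build `sessions` sessions, each issuing `writes` POSTs to the database."""
    return [
        ["/%s method=POST contents='%s'" % (db, random_body(doc_size))
         for _ in range(writes)]
        for _ in range(sessions)
    ]


def read_sessions(db, doc_ids, sessions, reads):
    """Build `sessions` sessions, each issuing `reads` GETs of random documents."""
    return [
        ["/%s/%s method=GET" % (db, random.choice(doc_ids))
         for _ in range(reads)]
        for _ in range(sessions)
    ]


def render(sessions):
    """Join requests with newlines and sessions with a blank line."""
    return "\n\n".join("\n".join(session) for session in sessions) + "\n"


# Values taken from the config above: 100 sessions, 10 writes, 10 reads,
# documents of roughly 100 bytes.
writes_log = write_sessions("testdb", sessions=100, writes=10, doc_size=100)

# Stand-in for the IDs a real run would read back from _all_docs.
fake_ids = [uuid.uuid4().hex for _ in range(1000)]
reads_log = read_sessions("testdb", fake_ids, sessions=100, reads=10)

print(render(reads_log).splitlines()[0])
print(sum(len(s) for s in writes_log), "writes,",
      sum(len(s) for s in reads_log), "reads")
```

With these settings each file holds 100 sessions of 10 requests, so a full run issues 1,000 writes followed by 1,000 reads.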

The info section at the end of the config contains any additional information you may want to have tag along with your results. You can put anything you want in there; I generally use this to describe the system I just ran the tests on.

Once you have a configuration you are happy with, it's very simple to run your tests.

baracus baracus.yml

This will print the standard httperf output as well as a tab-delimited version. If you want to check out the wsesslog files, they will be written to the directory you ran Baracus in as well as to the results document in the reporting database. Here's what the results look like in Couch.

Baracus is meant to make repeatable benchmarking easy. If you have any questions or bugs, or would like to contribute, check out Baracus on GitHub.

Posted by Joe Williams 


Cloudant at No:SQL (east)

Cloudant’s own Brad Anderson is helping to organize No:SQL (east), a conference centered around the latest nonrelational database systems. The conference will be held at the Georgia Tech Research Institute in Atlanta on Oct 28-30. The description from the website reads:

NoSQL East, a conference of non-relational data stores, aims to present the experiences of different companies using so-called “nosql” solutions in production.

The reasons for using these new systems are varied, but often involve scale that relational solutions cannot achieve. We have a new tool in our toolbelts, and as you will see, some people are using them quite effectively.

Join us in Atlanta as we discuss “Big Data” and the systems that are completely transforming how people look at data.

The organizers have a great venue picked out (courtesy of Georgia Tech) and an exciting group of speakers. A number of nonrelational systems will be covered, and Cloudant co-founder Mike Miller will be giving a talk on CouchDB. Anyone interested in the NoSQL space should really try to make it down to Atlanta to attend.

Cloudant is one of two primary sponsors of the event, the other being GTRI. Many, if not all, of us will be attending the conference. We’re always happy to talk about NoSQL, CouchDB, jQuery, the Bears winning the Super Bowl, etc. You should come say hi.

More information about No:SQL (east) can be found at https://nosqleast.com/2009/. Check it out.

Update (10/22/09):  Since posting the above, a couple other great companies, including Basho and The Rackspace Cloud, have also joined on as sponsors for the event.

Posted by Alan Hoffman 


“Well, how did I get here?” -Talking Heads

Dharmesh tells me that I’m supposed to write a blog post explaining how we got to where we are. I’ll try my best, but sometimes I feel like the guy in that Talking Heads song, looking around saying, “This is not my beautiful house. This is not my beautiful wife.” This company formed sometime between 2004 and 2008, but definitely around one of the rather large lunch tables on the first floor of the Stata Center at MIT. Adam, Mike, and I all worked in the same group in the physics department and every day we ate lunch around one of those oddly shaped tables. By January of 2008 we came to the conclusion that we couldn’t leave MIT without starting a company. We didn’t really know what we wanted to do, but we did know 1) we were all fairly bright and industrious guys, 2) we enjoyed working together, and 3) we didn’t want to go look for “real” jobs.

Mike’s original idea was for an automatic tamale maker. I’m not kidding. That one got nixed when we found out we were way behind already.

Eventually we decided to play to our strengths, which were in large-scale data processing and analysis. Our high-energy physics experiments generated some of the largest data sets in the world. It was our job to sift through the terabytes and petabytes to find rare particles and strange interactions. This is what we did for fun. Turns out that we’re living in a data-centric world (more so every day) and many of the problems we had faced were popping up in the business world. What we wanted was a way to bring physics-style data analysis tools to the masses. All we had to do was build it.

Toward the end of March someone sent me Paul Graham’s essay “A Student’s Guide to Startups.” I had never heard of PG before, but his essay resonated with me, especially the part about getting to stop treading water. I stumbled upon YCombinator literally by clicking random links on Graham’s website. As it turned out, this YCombinator Company was pretty well known and respected, and they were asking for applications for startups that needed funding. Hey! We were a startup that needed funding and we were already in Boston. So we applied, almost on a whim. And to be honest, we were a little bit floored when we got accepted.

After a number of back-and-forths with PG and other YC companies, we decided to focus on building a database designed for the specific needs of the web (hence our tagline “A database for the web,” clever huh?) One that would scale horizontally, could run on commodity hardware in a fault tolerant fashion, with a simple interface, and a powerful built-in analysis engine. We stumbled on Apache CouchDB, which at the time was in its relative infancy, and were very impressed with what they had built, especially the natural JSON-over-HTTP interface. We decided to take Couch and build it into a true ‘cloud-ready’ database.

It’s been more than a year since we started working on Cloudant. Since then a huge community has sprung up around CouchDB. Our own Adam Kocoloski was tapped to be an official committer on the project. We were able to raise significant additional funding late last year, just as the economy was collapsing—a true testament to the vision of our amazing VC. The Cloudant Team has grown from 3 to 6 as we were able to convince a few top-notch hackers to help us build our vision.

We’ll soon be releasing a beta version of Cloudant: CouchDB-in-the-cloud. It seems as though we were on to something when we began tossing ideas around in the Stata Center 18 months ago. The NoSQL movement, which is gaining steam, indicates the need for a new type of database, one built for the specific needs of web applications. At Cloudant we hope to fill that need. If you want to learn more or sign up to be notified of our impending release, click here.

Posted by Alan Hoffman 
