
How to recover a deleted document

Introduction


This article describes how you might be able to recover data in Cloudant after it has been deleted or overwritten.

Deleting a Cloudant document leaves behind a so-called tombstone: a shell of the original document containing only an _id/_rev pair and a _deleted: true flag. Soon after a document is deleted or updated, the body of the previous revision is removed by a process called “compaction”, which the Cloudant service runs automatically from time to time as an essential part of database maintenance. There is, however, a short window between updating or deleting a document and its old body being compacted. If you know what you’re doing, and you’re quick, it’s possible to recover the old document body before it is lost forever.
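To make the tombstone idea concrete, here is a minimal Python sketch that tells a live document apart from a tombstone, based only on the shape described above. The helper name is_tombstone is hypothetical (it is not part of any Cloudant library), and the revision strings are borrowed from the examples later in this article.

```python
def is_tombstone(doc: dict) -> bool:
    """Return True if the document is a deletion marker (tombstone)."""
    # A tombstone carries "_deleted": true; a live document has no such flag.
    return doc.get("_deleted") is True


# A live document: _id, _rev and the user's own fields.
live = {
    "_id": "example",
    "_rev": "1-4a5958602638984def83a2075a86bc7a",
    "data": "Your data here.",
}

# What remains after deletion: only _id, _rev and the _deleted flag.
tombstone = {
    "_id": "example",
    "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
    "_deleted": True,
}

print(is_tombstone(live), is_tombstone(tombstone))
```

The point of the sketch is that the "data" field is simply absent from the tombstone: once compaction has removed revision 1, there is nothing left to read it from.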

Examples


To follow the examples in this section, replace:

  • $USER with the Cloudant account name
  • $PASS with the password of $USER
  • $DB with the name of the database.
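If you would rather script these steps than type curl commands, the URLs used throughout this article can be assembled as in the sketch below. This is only an illustration: doc_url is a hypothetical helper, and "myaccount" and "mydb" are placeholder values standing in for $USER and $DB. It builds the URL only; sending the request and supplying credentials are left to whichever HTTP client you use.

```python
from urllib.parse import quote, urlencode


def doc_url(account: str, db: str, doc_id: str = "", **params: str) -> str:
    """Build https://<account>.cloudant.com/<db>[/<doc_id>][?<params>]."""
    url = f"https://{account}.cloudant.com/{quote(db, safe='')}"
    if doc_id:
        # Percent-encode the document id so unusual characters are safe.
        url += "/" + quote(doc_id, safe="")
    if params:
        # Query parameters such as rev=..., deleted=true, revs_info=true.
        url += "?" + urlencode(params)
    return url


# Placeholder account and database names (assumptions, not real values).
print(doc_url("myaccount", "mydb"))
print(doc_url("myaccount", "mydb", "example", deleted="true", revs_info="true"))
```

Building the query string with urlencode also avoids the shell-quoting problems that characters such as ? and & can cause on the command line.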

How to recover a document that has been deleted


Here are example steps which demonstrate how you might be able to recover a document after it has been deleted.

  1. Write a new document as an example:
curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
      -H "Content-Type: application/json" \
      -d '{   
              "_id": "example",
              "data": "Your data here."
          }'

The output I got:

{"ok":true,"id":"example","rev":"1-4a5958602638984def83a2075a86bc7a"}

indicates that the revision of the new document in this example is: 1-4a5958602638984def83a2075a86bc7a

  2. Delete the document:
curl -u $USER:$PASS -X DELETE "https://$USER.cloudant.com/$DB/example?rev=1-4a5958602638984def83a2075a86bc7a"

The output I got:

{"ok":true,"id":"example","rev":"2-45a4676f3cae54b3e7346d3a09dda771"}

indicates that the new revision of the (now deleted) document in this example is: 2-45a4676f3cae54b3e7346d3a09dda771

  3. These command outputs confirm that the document is now deleted:
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "error": "not_found",
  "reason": "deleted"
}
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?deleted=true" | jq .
{
  "_id": "example",
  "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
  "_deleted": true
}
  4. This command uses the revs_info=true parameter to get the status of the document revisions:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?deleted=true&revs_info=true" | jq .

Here is the output I got:

{
  "_id": "example",
  "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
  "_deleted": true,
  "_revs_info": [
    {
      "rev": "2-45a4676f3cae54b3e7346d3a09dda771",
      "status": "deleted"
    },
    {
      "rev": "1-4a5958602638984def83a2075a86bc7a",
      "status": "available"
    }
  ]
}

It shows that the revision which immediately precedes the deleted revision is 1-4a5958602638984def83a2075a86bc7a.
Because compaction of this document has not yet run since the document was deleted, revision 1-4a5958602638984def83a2075a86bc7a has status "available". If that revision were no longer available, its status would be "missing".

  5. If its status is "available", you can still get the contents of revision 1-4a5958602638984def83a2075a86bc7a:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?rev=1-4a5958602638984def83a2075a86bc7a" | jq .
{
  "_id": "example",
  "_rev": "1-4a5958602638984def83a2075a86bc7a",
  "data": "Your data here."
}
  6. Now you can write the contents of the revision back just as it was before the document was deleted. Do not include the _rev field.
curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
      -H "Content-Type: application/json"  \
      -d '{
              "_id": "example",
              "data": "Your data here." 
          }'

The output I got:

{"ok":true,"id":"example","rev":"3-a95d2245a9f11e5fa62390c600204d18"}     

indicates that the new revision in this example is: 3-a95d2245a9f11e5fa62390c600204d18

  7. This command confirms that the document is now live (not deleted):
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "_id": "example",
  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
  "data": "Your data here."
}
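The decision logic in the steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official client: recoverable_rev and restore_payload are hypothetical helper names, and the sketch only inspects JSON that has already been fetched (the ?deleted=true&revs_info=true response and the old revision's body). It assumes, as in the output shown above, that _revs_info lists revisions newest first.

```python
from typing import Optional


def recoverable_rev(doc: dict) -> Optional[str]:
    """Return the revision just before the tombstone if its body still exists."""
    revs = doc.get("_revs_info", [])
    # _revs_info is listed newest first, so index 0 is the tombstone
    # and index 1 is the revision that was deleted.
    if len(revs) < 2:
        return None
    previous = revs[1]
    return previous["rev"] if previous["status"] == "available" else None


def restore_payload(old_revision: dict) -> dict:
    """Copy the old body without _rev, ready to be written back."""
    return {key: value for key, value in old_revision.items() if key != "_rev"}


# The revs_info response from the example above.
deleted_doc = {
    "_id": "example",
    "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
    "_deleted": True,
    "_revs_info": [
        {"rev": "2-45a4676f3cae54b3e7346d3a09dda771", "status": "deleted"},
        {"rev": "1-4a5958602638984def83a2075a86bc7a", "status": "available"},
    ],
}

# The body returned when fetching that revision with ?rev=...
old_revision = {
    "_id": "example",
    "_rev": "1-4a5958602638984def83a2075a86bc7a",
    "data": "Your data here.",
}

print(recoverable_rev(deleted_doc))
print(restore_payload(old_revision))
```

If recoverable_rev returns None, the previous revision has status "missing" (or there is no previous revision), and the old body can no longer be retrieved this way.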

How to recover a document that has been overwritten


Here are example steps which demonstrate how you might be able to recover the original data after a document has been updated.

  1. Update the example document, replacing the original data with new data.
  curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
          -H "Content-Type: application/json" \
          -d '{   
                  "_id": "example",
                  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
                  "data": "New data that replaces the original data."
              }'

The output I got:

{"ok":true,"id":"example","rev":"4-9515f5aa01411766cc8aed181af12c1c"}

indicates that the new document revision in this example is: 4-9515f5aa01411766cc8aed181af12c1c

  2. This command confirms that the original data in the document has been replaced by the new data:
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "_id": "example",
  "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
  "data": "New data that replaces the original data."
}
  3. This command uses the revs_info=true parameter to get the current status of the document revisions:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?revs_info=true" | jq .

Here is the output I got:

{
  "_id": "example",
  "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
  "data": "New data that replaces the original data.",
  "_revs_info": [
    {
      "rev": "4-9515f5aa01411766cc8aed181af12c1c",
      "status": "available"
    },
    {
      "rev": "3-a95d2245a9f11e5fa62390c600204d18",
      "status": "available"
    },
    {
      "rev": "2-45a4676f3cae54b3e7346d3a09dda771",
      "status": "deleted"
    },
    {
      "rev": "1-4a5958602638984def83a2075a86bc7a",
      "status": "available"
    }
  ]
}

It shows that the revision which immediately precedes the latest revision is 3-a95d2245a9f11e5fa62390c600204d18.
Because compaction of this document has not yet run since the document was updated, revision 3-a95d2245a9f11e5fa62390c600204d18 has status "available". If that revision were no longer available, its status would be "missing".

  4. If its status is "available", you can still get the contents of revision 3-a95d2245a9f11e5fa62390c600204d18:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?rev=3-a95d2245a9f11e5fa62390c600204d18" | jq .
{
  "_id": "example",
  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
  "data": "Your data here."
}
  5. Now you can write the contents of the revision back just as it was before the document was updated. This time you must include the latest _rev in the document you write.
curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
      -H "Content-Type: application/json"  \
      -d '{
              "_id": "example",
              "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
              "data": "Your data here." 
          }'

The output I got:

{"ok":true,"id":"example","rev":"5-0bdac581d633b76a696c2b4b3972c87d"}

indicates that the new revision in this example is: 5-0bdac581d633b76a696c2b4b3972c87d

  6. This command confirms that the document now contains the original data, as it was before it was updated:
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "_id": "example",
  "_rev": "5-0bdac581d633b76a696c2b4b3972c87d",
  "data": "Your data here."
}
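The same idea for the overwrite case can be sketched in Python. The difference from recovering a deleted document is that the body written back must carry the current _rev, so that the write is accepted as an update to the live document. The names available_revs and rollback_payload are hypothetical helpers for illustration, and the sketch works only on JSON that has already been fetched.

```python
from typing import List


def available_revs(doc: dict) -> List[str]:
    """List the revisions whose bodies can still be fetched, newest first."""
    return [
        entry["rev"]
        for entry in doc.get("_revs_info", [])
        if entry["status"] == "available"
    ]


def rollback_payload(old_revision: dict, current_rev: str) -> dict:
    """Copy an old revision's body, but set _rev to the current revision."""
    payload = dict(old_revision)
    payload["_rev"] = current_rev
    return payload


# The revs_info response from the example above.
current_doc = {
    "_id": "example",
    "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
    "data": "New data that replaces the original data.",
    "_revs_info": [
        {"rev": "4-9515f5aa01411766cc8aed181af12c1c", "status": "available"},
        {"rev": "3-a95d2245a9f11e5fa62390c600204d18", "status": "available"},
        {"rev": "2-45a4676f3cae54b3e7346d3a09dda771", "status": "deleted"},
        {"rev": "1-4a5958602638984def83a2075a86bc7a", "status": "available"},
    ],
}

# The body returned when fetching revision 3 with ?rev=...
old_revision = {
    "_id": "example",
    "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
    "data": "Your data here.",
}

print(available_revs(current_doc))
print(rollback_payload(old_revision, current_doc["_rev"]))
```

The resulting payload matches the body posted in the final write step above: the original data together with the latest _rev.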

How to find which documents have been deleted or overwritten


To find out which document IDs have been deleted or overwritten, you can use the changes feed, which returns a list of changes made to documents in the database, including insertions, updates, and deletions.

For example:

curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/_changes
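Once you have the changes feed as JSON, a short script can pick out candidate documents. The Python sketch below is an illustration only: classify_changes is a hypothetical helper, and the sample feed is invented to show the general shape of a _changes response (its seq values and the second and third document IDs are placeholders, not real output). Deleted documents are marked with "deleted": true in the feed. Treating a live document whose revision number is greater than 1 as "updated" is a heuristic, since it also matches documents that were deleted and later recreated.

```python
from typing import List, Tuple


def classify_changes(feed: dict) -> Tuple[List[str], List[str]]:
    """Split a _changes response into (deleted ids, updated ids)."""
    deleted: List[str] = []
    updated: List[str] = []
    for row in feed.get("results", []):
        if row.get("deleted"):
            deleted.append(row["id"])
            continue
        # The number before the first "-" in a rev counts the edits so far.
        generation = max(int(c["rev"].split("-", 1)[0]) for c in row["changes"])
        if generation > 1:
            updated.append(row["id"])
    return deleted, updated


# Invented sample data illustrating the response shape (placeholders only).
sample_feed = {
    "results": [
        {"seq": "1-xxx", "id": "example",
         "changes": [{"rev": "5-0bdac581d633b76a696c2b4b3972c87d"}]},
        {"seq": "2-xxx", "id": "placeholder-deleted-doc",
         "changes": [{"rev": "2-aaa"}], "deleted": True},
        {"seq": "3-xxx", "id": "placeholder-new-doc",
         "changes": [{"rev": "1-bbb"}]},
    ],
    "last_seq": "3-xxx",
    "pending": 0,
}

deleted_ids, updated_ids = classify_changes(sample_feed)
print("deleted:", deleted_ids)
print("updated:", updated_ids)
```

Each ID found this way can then be checked with revs_info=true, as in the earlier sections, to see whether the previous revision is still available.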