
How to recover a deleted document

Introduction


This article describes how you might be able to recover data in Cloudant after it has been deleted or overwritten.


Deleting a Cloudant document leaves behind a so-called tombstone: a shell of the original document containing only an _id/_rev pair and a _deleted: true flag. Soon after a document is deleted or updated, the body of the previous revision is removed by a process called “compaction”, which the Cloudant service runs automatically from time to time as an essential part of database maintenance. There is, however, a short window between updating or deleting a document and its old body being compacted. If you know what you’re doing, and you’re quick, it’s possible to recover the old document body before it is lost forever.

Examples


To follow the examples in this section, replace:

  • $USER with the Cloudant account name
  • $PASS with the password of $USER
  • $DB with the name of the database.
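
For readers who would rather script these requests than type them, here is a minimal Python sketch of how the placeholders combine into request URLs. The helper name `doc_url` and the account and database names are hypothetical, chosen only for illustration; the URL pattern itself is the one used by the curl commands in this article. It is meant for ordinary document ids and percent-encodes anything unusual in them.

```python
from urllib.parse import quote, urlencode


def doc_url(account, db, doc_id=None, **params):
    """Build a Cloudant URL for a database or one of its documents.

    `account` and `db` play the roles of $USER and $DB in the curl examples.
    Keyword arguments become query-string parameters such as rev=... .
    """
    url = f"https://{quote(account, safe='')}.cloudant.com/{quote(db, safe='')}"
    if doc_id is not None:
        # Percent-encode the id so spaces or slashes cannot break the path.
        url += "/" + quote(doc_id, safe="")
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical account and database names, for illustration only.
print(doc_url("myaccount", "mydb", "example", deleted="true", revs_info="true"))
```

Building the query string with `urlencode` also sidesteps the shell-quoting issues that `?` and `&` can cause on the command line.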

How to recover a document that has been deleted


The following steps demonstrate how you might recover a document after it has been deleted.

  1. Write a new document as an example:
curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
      -H "Content-Type: application/json" \
      -d '{   
              "_id": "example",
              "data": "Your data here."
          }'

The output I got:

{"ok":true,"id":"example","rev":"1-4a5958602638984def83a2075a86bc7a"}

indicates that the revision of the new document in this example is: 1-4a5958602638984def83a2075a86bc7a

  2. Delete the document:
curl -u $USER:$PASS -X DELETE "https://$USER.cloudant.com/$DB/example?rev=1-4a5958602638984def83a2075a86bc7a"

The output I got:

{"ok":true,"id":"example","rev":"2-45a4676f3cae54b3e7346d3a09dda771"}

indicates that the new revision of the (now deleted) document in this example is: 2-45a4676f3cae54b3e7346d3a09dda771

  3. These command outputs confirm that the document is now deleted:
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "error": "not_found",
  "reason": "deleted"
}
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?deleted=true" | jq .
{
  "_id": "example",
  "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
  "_deleted": true
}
  4. This command uses the revs_info=true parameter to get the status of the document revisions:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?deleted=true&revs_info=true" | jq .

Here is the output I got:

{
  "_id": "example",
  "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
  "_deleted": true,
  "_revs_info": [
    {
      "rev": "2-45a4676f3cae54b3e7346d3a09dda771",
      "status": "deleted"
    },
    {
      "rev": "1-4a5958602638984def83a2075a86bc7a",
      "status": "available"
    }
  ]
}

It shows that the revision which immediately precedes the deleted revision is 1-4a5958602638984def83a2075a86bc7a.
Because compaction of this document has not yet run since the document was deleted, revision 1-4a5958602638984def83a2075a86bc7a has status "available". If that revision were no longer available, its status would be "missing".

  5. If its status is "available", you can still get the contents of revision 1-4a5958602638984def83a2075a86bc7a:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?rev=1-4a5958602638984def83a2075a86bc7a" | jq .
{
  "_id": "example",
  "_rev": "1-4a5958602638984def83a2075a86bc7a",
  "data": "Your data here."
}
  6. Now you can write the contents of the revision back just as it was before the document was deleted. Do not include the _rev field.
curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
      -H "Content-Type: application/json"  \
      -d '{
              "_id": "example",
              "data": "Your data here." 
          }'

The output I got:

{"ok":true,"id":"example","rev":"3-a95d2245a9f11e5fa62390c600204d18"}     

indicates that the new revision in this example is: 3-a95d2245a9f11e5fa62390c600204d18

  7. This command confirms that the document is now live (not deleted):
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "_id": "example",
  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
  "data": "Your data here."
}
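
The decision made in the steps above (is the previous revision still available, and what should be written back?) can be sketched in Python. This is only an illustration working on the sample responses shown in this section; the function names `previous_available_rev` and `body_for_undelete` are hypothetical, and the sketch assumes, as in the output above, that `_revs_info` lists revisions newest first.

```python
import json

# Sample responses copied from the steps above.
revs_info_response = {
    "_id": "example",
    "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
    "_deleted": True,
    "_revs_info": [
        {"rev": "2-45a4676f3cae54b3e7346d3a09dda771", "status": "deleted"},
        {"rev": "1-4a5958602638984def83a2075a86bc7a", "status": "available"},
    ],
}
old_revision = {
    "_id": "example",
    "_rev": "1-4a5958602638984def83a2075a86bc7a",
    "data": "Your data here.",
}


def previous_available_rev(doc):
    """Return the revision just before the current one if its body survives.

    Returns None when there is no earlier revision or when its status is
    anything other than "available" (for example "missing").
    """
    revs = doc.get("_revs_info", [])
    if len(revs) < 2:
        return None
    previous = revs[1]  # revs[0] is the current (here: deleted) revision
    return previous["rev"] if previous["status"] == "available" else None


def body_for_undelete(old_doc):
    """Copy an old revision's body without `_rev`, ready to be written back."""
    return {key: value for key, value in old_doc.items() if key != "_rev"}


print(previous_available_rev(revs_info_response))
print(json.dumps(body_for_undelete(old_revision)))
```

If `previous_available_rev` returns None, the old body has most likely been compacted away and this approach cannot bring it back.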

How to recover a document that has been overwritten


The following steps demonstrate how you might recover the original data after a document has been updated.

  1. Update the example document, replacing the original data with new data.
  curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
          -H "Content-Type: application/json" \
          -d '{   
                  "_id": "example",
                  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
                  "data": "New data that replaces the original data."
              }'

The output I got:

{"ok":true,"id":"example","rev":"4-9515f5aa01411766cc8aed181af12c1c"}

indicates that the new document revision in this example is: 4-9515f5aa01411766cc8aed181af12c1c

  2. This command confirms that the original data in the document has been replaced by the new data:
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "_id": "example",
  "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
  "data": "New data that replaces the original data."
}
  3. This command uses the revs_info=true parameter to get the current status of the document revisions:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?revs_info=true" | jq .

Here is the output I got:

{
  "_id": "example",
  "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
  "data": "New data that replaces the original data.",
  "_revs_info": [
    {
      "rev": "4-9515f5aa01411766cc8aed181af12c1c",
      "status": "available"
    },
    {
      "rev": "3-a95d2245a9f11e5fa62390c600204d18",
      "status": "available"
    },
    {
      "rev": "2-45a4676f3cae54b3e7346d3a09dda771",
      "status": "deleted"
    },
    {
      "rev": "1-4a5958602638984def83a2075a86bc7a",
      "status": "available"
    }
  ]
}

It shows that the revision which immediately precedes the latest revision is 3-a95d2245a9f11e5fa62390c600204d18.
Because compaction of this document has not yet run since the document was updated, revision 3-a95d2245a9f11e5fa62390c600204d18 has status "available". If that revision were no longer available, its status would be "missing".

  4. If its status is "available", you can still get the contents of revision 3-a95d2245a9f11e5fa62390c600204d18:
$ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?rev=3-a95d2245a9f11e5fa62390c600204d18" | jq .
{
  "_id": "example",
  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
  "data": "Your data here."
}
  5. Now you can write the contents of the revision back just as it was before the document was updated. This time you must include the latest _rev in the document you write.
curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
      -H "Content-Type: application/json"  \
      -d '{
              "_id": "example",
              "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
              "data": "Your data here." 
          }'

The output I got:

{"ok":true,"id":"example","rev":"5-0bdac581d633b76a696c2b4b3972c87d"}

indicates that the new revision in this example is: 5-0bdac581d633b76a696c2b4b3972c87d

  6. This command confirms that the document now contains the original data, as it was before it was updated:
$ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
{
  "_id": "example",
  "_rev": "5-0bdac581d633b76a696c2b4b3972c87d",
  "data": "Your data here."
}
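
The one difference from the deleted-document case, namely that the restored body must carry the latest _rev rather than none at all, can be sketched in Python as follows. The function name `body_for_rollback` is hypothetical, and the sample data is copied from the responses shown in this section.

```python
import json

# Sample responses copied from the steps above.
current_doc = {
    "_id": "example",
    "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
    "data": "New data that replaces the original data.",
}
old_revision = {
    "_id": "example",
    "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
    "data": "Your data here.",
}


def body_for_rollback(old_doc, latest_rev):
    """Copy an old revision's body and stamp it with the latest `_rev`.

    The write is accepted only if `_rev` matches the document's current
    revision, so the old revision's own `_rev` has to be replaced.
    """
    body = dict(old_doc)  # shallow copy; the input is left unchanged
    body["_rev"] = latest_rev
    return body


print(json.dumps(body_for_rollback(old_revision, current_doc["_rev"]), indent=2))
```

The printed JSON has the same content as the body posted in step 5. If another writer updates the document in the meantime, the latest _rev will have moved on and the write should be rejected as a conflict, in which case the current revision has to be fetched again.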

How to find what documents have been deleted or overwritten


To find out which document IDs have been deleted or overwritten, you can use the changes feed. It returns a list of the changes that have been made to documents in the database, including insertions, updates, and deletions.

For example:

curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/_changes
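
The raw feed can be long, so it may help to filter it. The Python sketch below sorts document ids into "deleted" and "updated at least once" from a parsed _changes response. The sample response is invented for illustration (the ids `old-report` and `untouched` and all `seq` values are hypothetical); it assumes the usual shape of the feed, where each entry in `results` has an `id`, a `changes` list of leaf revisions, and a `deleted: true` flag when the latest revision is a tombstone. The function name `classify_changes` is also hypothetical.

```python
# Invented sample in the shape of a parsed _changes response.
sample_changes = {
    "results": [
        {"seq": "1-xxx", "id": "untouched",
         "changes": [{"rev": "1-967a00dff5e02add41819138abb3284d"}]},
        {"seq": "2-xxx", "id": "old-report",
         "changes": [{"rev": "2-eec205a9d413992850a6e32678485900"}],
         "deleted": True},
        {"seq": "3-xxx", "id": "example",
         "changes": [{"rev": "5-0bdac581d633b76a696c2b4b3972c87d"}]},
    ],
    "last_seq": "3-xxx",
}


def classify_changes(changes):
    """Split document ids into (deleted, updated) lists.

    A row counts as "updated" when it is not deleted and any of its leaf
    revisions has a generation number (the part before the dash) above 1.
    """
    deleted, updated = [], []
    for row in changes["results"]:
        if row.get("deleted"):
            deleted.append(row["id"])
        elif any(int(c["rev"].split("-", 1)[0]) > 1 for c in row["changes"]):
            updated.append(row["id"])
    return deleted, updated


deleted_ids, updated_ids = classify_changes(sample_changes)
print("deleted:", deleted_ids)
print("updated:", updated_ids)
```

A revision generation above 1 only shows that a document has been written more than once, so treat the "updated" list as a set of candidates to inspect with revs_info=true rather than as proof that data was lost.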