How to recover a deleted document

July 17, 2020 | Brian Wilkins | Compaction

Introduction

This article describes how you might be able to recover data in Cloudant after it has been deleted or overwritten.

Some time after you delete or update a document in Cloudant, old revisions of the document are converted to tombstones by stripping out all of their fields except for _id, _rev and _deleted. The process that converts old revisions to tombstones is called “compaction” and it runs automatically from time to time in the Cloudant service as an essential part of database maintenance. Until a document revision is converted to a tombstone, you can recover its contents.

Examples

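To follow the examples in this section, replace the variables in the commands with your own values: $USER (the account name, which also appears in the host name), $PASS (the password), and $DB (the database name).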

How to recover a document that has been deleted

The following example steps show how you might recover a document after it has been deleted.

  1. Write a new document as an example:

     curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
          -H "Content-Type: application/json" \
          -d '{   
                  "_id": "example",
                  "data": "Your data here."
              }'
    

    The output I got:

     {"ok":true,"id":"example","rev":"1-4a5958602638984def83a2075a86bc7a"}
    

    indicates that the revision of the new document in this example is: 1-4a5958602638984def83a2075a86bc7a

  2. Delete the document:

     curl -u $USER:$PASS -X DELETE "https://$USER.cloudant.com/$DB/example?rev=1-4a5958602638984def83a2075a86bc7a"
    

    The output I got:

     {"ok":true,"id":"example","rev":"2-45a4676f3cae54b3e7346d3a09dda771"}
    

    indicates that the new revision of the (now deleted) document in this example is: 2-45a4676f3cae54b3e7346d3a09dda771

  3. These command outputs confirm that the document is now deleted:

     $ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
     {
       "error": "not_found",
       "reason": "deleted"
     }
    
     $ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?deleted=true" | jq .
     {
       "_id": "example",
       "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
       "_deleted": true
     }
    
  4. This command uses the revs_info=true parameter to get the status of the document revisions:

     $ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?deleted=true&revs_info=true" | jq .
    

    Here is the output I got:

     {
       "_id": "example",
       "_rev": "2-45a4676f3cae54b3e7346d3a09dda771",
       "_deleted": true,
       "_revs_info": [
         {
           "rev": "2-45a4676f3cae54b3e7346d3a09dda771",
           "status": "deleted"
         },
         {
           "rev": "1-4a5958602638984def83a2075a86bc7a",
           "status": "available"
         }
       ]
     }
    

    It shows that the revision which immediately precedes the deleted revision is 1-4a5958602638984def83a2075a86bc7a.
    Because compaction of this document has not yet run since the document was deleted, revision 1-4a5958602638984def83a2075a86bc7a has status "available". If that revision were no longer available, its status would be "missing".

  5. If its status is "available" you can still get the contents of revision 1-4a5958602638984def83a2075a86bc7a:

     $ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?rev=1-4a5958602638984def83a2075a86bc7a" | jq .
     {
       "_id": "example",
       "_rev": "1-4a5958602638984def83a2075a86bc7a",
       "data": "Your data here."
     }
    
  6. Now you can write the contents of the revision back just as it was before the document was deleted. Do not include the _rev field.

     curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
          -H "Content-Type: application/json"  \
          -d '{
                  "_id": "example",
                  "data": "Your data here." 
              }'
    

    The output I got:

     {"ok":true,"id":"example","rev":"3-a95d2245a9f11e5fa62390c600204d18"}     
    

    indicates that the new revision in this example is: 3-a95d2245a9f11e5fa62390c600204d18

  7. This command confirms that the document is now live (not deleted):

     $ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
     {
       "_id": "example",
       "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
       "data": "Your data here."
     }
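
Steps 4 to 6 above could be combined into a small script. What follows is only a minimal sketch, not something from the original walkthrough: the function names latest_available_rev, strip_rev and recover_deleted are hypothetical, and it assumes that jq is installed and that $USER, $PASS and $DB are set as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of any Cloudant tooling.

# Read a revs_info response on stdin and print the newest revision whose
# status is "available" and which is not the current (deleted) revision.
# Prints nothing if no such revision exists.
latest_available_rev() {
  jq -r '._rev as $current
         | [._revs_info[] | select(.status == "available" and .rev != $current)]
         | .[0]
         | .rev // empty'
}

# Read a document on stdin and print it without its _rev field,
# ready to be written back after a deletion (step 6).
strip_rev() {
  jq -c 'del(._rev)'
}

# Recover the deleted document whose id is $1 by writing back
# the contents of its last available revision.
recover_deleted() {
  local id="$1" rev
  rev=$(curl -s -u "$USER:$PASS" \
          "https://$USER.cloudant.com/$DB/$id?deleted=true&revs_info=true" \
        | latest_available_rev)
  if [ -z "$rev" ]; then
    echo "no available revision for $id" >&2
    return 1
  fi
  curl -s -u "$USER:$PASS" "https://$USER.cloudant.com/$DB/$id?rev=$rev" \
    | strip_rev \
    | curl -s -u "$USER:$PASS" -X POST "https://$USER.cloudant.com/$DB" \
           -H "Content-Type: application/json" -d @-
}

# Usage (makes real requests against your database):
#   recover_deleted example
```

The two jq helpers only transform JSON, so they can be tried on saved command output before running recover_deleted against a real database.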
    

How to recover a document that has been overwritten

The following example steps show how you might recover the original data after a document has been updated.

  1. Update the example document, replacing the original data with new data.

     curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
          -H "Content-Type: application/json" \
          -d '{
                  "_id": "example",
                  "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
                  "data": "New data that replaces the original data."
              }'
    

    The output I got:

     {"ok":true,"id":"example","rev":"4-9515f5aa01411766cc8aed181af12c1c"}
    

    indicates that the new document revision in this example is: 4-9515f5aa01411766cc8aed181af12c1c

  2. This command confirms that the original data in the document has been replaced by the new data:

     $ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
     {
       "_id": "example",
       "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
       "data": "New data that replaces the original data."
     }
    
    
  3. This command uses the revs_info=true parameter to get the status of the document revisions now:

     $ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?revs_info=true" | jq .
    

    Here is the output I got:

     {
       "_id": "example",
       "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
       "data": "New data that replaces the original data.",
       "_revs_info": [
         {
           "rev": "4-9515f5aa01411766cc8aed181af12c1c",
           "status": "available"
         },
         {
           "rev": "3-a95d2245a9f11e5fa62390c600204d18",
           "status": "available"
         },
         {
           "rev": "2-45a4676f3cae54b3e7346d3a09dda771",
           "status": "deleted"
         },
         {
           "rev": "1-4a5958602638984def83a2075a86bc7a",
           "status": "available"
         }
       ]
     }
    

    It shows that the revision which immediately precedes the latest revision is 3-a95d2245a9f11e5fa62390c600204d18.
    Because compaction of this document has not yet run since the document was updated, revision 3-a95d2245a9f11e5fa62390c600204d18 has status "available". If that revision were no longer available, its status would be "missing".

  4. If its status is "available" you can still get the contents of revision 3-a95d2245a9f11e5fa62390c600204d18:

     $ curl -s -u $USER:$PASS "https://$USER.cloudant.com/$DB/example?rev=3-a95d2245a9f11e5fa62390c600204d18" | jq .
     {
       "_id": "example",
       "_rev": "3-a95d2245a9f11e5fa62390c600204d18",
       "data": "Your data here."
     }
    
  5. Now you can write the contents of the revision back just as it was before the document was updated. This time you must include the latest _rev in the document you write.

     curl -u $USER:$PASS -X POST https://$USER.cloudant.com/$DB \
          -H "Content-Type: application/json"  \
          -d '{
                  "_id": "example",
                  "_rev": "4-9515f5aa01411766cc8aed181af12c1c",
                  "data": "Your data here." 
              }'
    

    The output I got:

     {"ok":true,"id":"example","rev":"5-0bdac581d633b76a696c2b4b3972c87d"}
    

    indicates that the new revision in this example is: 5-0bdac581d633b76a696c2b4b3972c87d

  6. This command confirms that the document now contains the original data, as it was before it was updated:

     $ curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/example | jq .
     {
       "_id": "example",
       "_rev": "5-0bdac581d633b76a696c2b4b3972c87d",
       "data": "Your data here."
     }
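
Steps 3 to 5 above could likewise be scripted. This is a minimal sketch under the same assumptions as the commands above (jq installed; $USER, $PASS and $DB set); the function names previous_available_rev, with_current_rev and restore_overwritten are hypothetical. The only difference from recovering a deleted document is that the old contents are written back with the latest _rev rather than with no _rev at all.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of any Cloudant tooling.

# Read a revs_info response on stdin and print the revision that immediately
# precedes the latest one, but only if its status is "available".
previous_available_rev() {
  jq -r '._revs_info[1] | select(.status == "available") | .rev'
}

# Read an old revision on stdin and print it with _rev replaced by $1
# (the latest revision), so that the write updates the live document.
with_current_rev() {
  jq -c --arg rev "$1" '. + {_rev: $rev}'
}

# Restore document $1 to the contents of its previous revision.
restore_overwritten() {
  local id="$1" info current previous
  info=$(curl -s -u "$USER:$PASS" \
           "https://$USER.cloudant.com/$DB/$id?revs_info=true")
  current=$(printf '%s' "$info" | jq -r '._rev')
  previous=$(printf '%s' "$info" | previous_available_rev)
  if [ -z "$previous" ]; then
    echo "previous revision of $id is not available" >&2
    return 1
  fi
  curl -s -u "$USER:$PASS" "https://$USER.cloudant.com/$DB/$id?rev=$previous" \
    | with_current_rev "$current" \
    | curl -s -u "$USER:$PASS" -X POST "https://$USER.cloudant.com/$DB" \
           -H "Content-Type: application/json" -d @-
}

# Usage (makes real requests against your database):
#   restore_overwritten example
```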
    

How to find what documents have been deleted or overwritten

To find out which document IDs have been deleted or overwritten, you can use the changes feed, which returns a list of changes made to documents in the database, including insertions, updates, and deletions. Entries for deleted documents carry "deleted": true.

For example:

curl -s -u $USER:$PASS https://$USER.cloudant.com/$DB/_changes
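
The changes feed output can be narrowed down with jq. The following is a minimal sketch with hypothetical function names (deleted_ids, updated_ids); it assumes the usual response shape, a "results" array whose entries have "id", "changes" and, for deletions, "deleted": true. Treating any live document whose latest revision number is greater than 1 as "updated" is an assumption of this sketch, and it also matches documents that were deleted and then written back.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of any Cloudant tooling.

# Read a _changes response on stdin and print the ids of documents
# whose latest change is a deletion.
deleted_ids() {
  jq -r '.results[] | select(.deleted == true) | .id'
}

# Read a _changes response on stdin and print the ids of live documents
# whose latest revision is not the first one, that is, documents that
# have been written more than once.
updated_ids() {
  jq -r '.results[]
         | select(.deleted != true)
         | select((.changes[0].rev | split("-") | .[0] | tonumber) > 1)
         | .id'
}

# Usage (makes a real request against your database):
#   curl -s -u "$USER:$PASS" "https://$USER.cloudant.com/$DB/_changes" | deleted_ids
#   curl -s -u "$USER:$PASS" "https://$USER.cloudant.com/$DB/_changes" | updated_ids
```

Each id that is printed can then be checked with revs_info=true, as in the steps above, to see whether an earlier revision is still "available".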