Is it possible to delete all or multiple documents in a collection through the Azure portal, an Azure Cosmos DB SQL query, or a PowerShell script?

Cœur
armadillo.mx
  • Possible duplicate of [How to delete all the documents in DocumentDB through c# code](https://stackoverflow.com/questions/29137708/how-to-delete-all-the-documents-in-documentdb-through-c-sharp-code) – dumbchemistry Nov 14 '18 at 16:45

1 Answer

In my experience, the fastest way to delete all documents is to set "time to live" on the container to 1 second, which expires every document. Be aware that the actual removal runs in the background and takes some time: if you set "time to live" back to unlimited too soon, the documents that haven't been removed yet will reappear.

You can set "time to live" under "Scale and Settings" for the container; see https://learn.microsoft.com/en-us/azure/cosmos-db/how-to-time-to-live
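
To make the reappearing-documents behaviour concrete, here is a minimal sketch in plain JavaScript of how "time to live" visibility can be modelled. It is not SDK code: the names `effectiveTtl` and `isVisible` are made up for illustration. It assumes the documented semantics as I understand them: TTL is off when the container default is unset, -1 means "on, but never expire", a positive value is a number of seconds counted from the document's `_ts`, and a numeric `ttl` property on a document overrides the container default.

```javascript
// Simplified model of Cosmos DB TTL visibility (illustration only, not SDK code).
// containerDefaultTtl: undefined/null = TTL off, -1 = on without expiry,
// positive n = documents expire n seconds after their last write (_ts).
function effectiveTtl(item, containerDefaultTtl) {
    // TTL disabled on the container: an item-level ttl is ignored.
    if (containerDefaultTtl === undefined || containerDefaultTtl === null) return null;
    // A numeric item-level ttl overrides the container default.
    var ttl = (typeof item.ttl === "number") ? item.ttl : containerDefaultTtl;
    // -1 means "never expires"; represent that as null.
    return ttl === -1 ? null : ttl;
}

// Returns true if a query at time nowSeconds would still return the document.
function isVisible(item, containerDefaultTtl, nowSeconds) {
    var ttl = effectiveTtl(item, containerDefaultTtl);
    if (ttl === null) return true;
    return item._ts + ttl > nowSeconds;
}

var doc = { id: "a", _ts: 1000 };
console.log(isVisible(doc, 1, 2000));    // false: expired, hidden from queries
console.log(isVisible(doc, null, 2000)); // true: TTL switched off again
```

The second call is the situation described above: a document that has expired but has not yet been physically removed becomes visible again as soon as "time to live" is switched off.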

You could also create a stored procedure in the container and run it. The sample below comes from https://github.com/Azure/azure-cosmosdb-js-server/blob/master/samples/stored-procedures/bulkDelete.js

The stored procedure:

/**
 * A Cosmos DB stored procedure that bulk deletes documents for a given query.<br/>
 * Note: You may need to execute this stored procedure multiple times (depending whether the stored procedure is able to delete every document within the execution timeout limit).
 *
 * @function
 * @param {string} query - A query that provides the documents to be deleted (e.g. "SELECT c._self FROM c WHERE c.founded_year = 2008"). Note: For best performance, reduce the # of properties returned per document in the query to only what's required (e.g. prefer SELECT c._self over SELECT * )
 * @returns {Object.<number, boolean>} Returns an object with the two properties:<br/>
 *   deleted - contains a count of documents deleted<br/>
 *   continuation - a boolean whether you should execute the stored procedure again (true if there are more documents to delete; false otherwise).
 */
function bulkDeleteStoredProcedure(query) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var response = getContext().getResponse();
    var responseBody = {
        deleted: 0,
        continuation: true
    };

    // Validate input.
    if (!query) throw new Error("The query is undefined or null.");

    tryQueryAndDelete();

    // Recursively runs the query w/ support for continuation tokens.
    // Calls tryDelete(documents) as soon as the query returns documents.
    function tryQueryAndDelete(continuation) {
        var requestOptions = {continuation: continuation};

        var isAccepted = collection.queryDocuments(collectionLink, query, requestOptions, function (err, retrievedDocs, responseOptions) {
            if (err) throw err;

            if (retrievedDocs.length > 0) {
                // Begin deleting documents as soon as documents are returned from the query results.
                // tryDelete() resumes querying after deleting; no need to page through continuation tokens.
                //  - this is to prioritize writes over reads given timeout constraints.
                tryDelete(retrievedDocs);
            } else if (responseOptions.continuation) {
                // Else if the query came back empty, but with a continuation token; repeat the query w/ the token.
                tryQueryAndDelete(responseOptions.continuation);
            } else {
                // Else if there are no more documents and no continuation token - we are finished deleting documents.
                responseBody.continuation = false;
                response.setBody(responseBody);
            }
        });

        // If we hit execution bounds - return continuation: true.
        if (!isAccepted) {
            response.setBody(responseBody);
        }
    }

    // Recursively deletes documents passed in as an array argument.
    // Attempts to query for more on empty array.
    function tryDelete(documents) {
        if (documents.length > 0) {
            // Delete the first document in the array.
            var isAccepted = collection.deleteDocument(documents[0]._self, {}, function (err, responseOptions) {
                if (err) throw err;

                responseBody.deleted++;
                documents.shift();
                // Delete the next document in the array.
                tryDelete(documents);
            });

            // If we hit execution bounds - return continuation: true.
            if (!isAccepted) {
                response.setBody(responseBody);
            }
        } else {
            // If the document array is empty, query for more documents.
            tryQueryAndDelete();
        }
    }
}

You could then write a PowerShell script to run that stored procedure.
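
Whatever runs the procedure has to call it repeatedly, because each execution can stop early and report `continuation: true`. Here is a rough sketch of that loop, written in JavaScript for Node.js rather than PowerShell. The helper `runBulkDelete`, its `maxRounds` cap and the injected `executeOnce` function are hypothetical names for this illustration. The commented-out wiring assumes the `@azure/cosmos` package and placeholder database, container and procedure names.

```javascript
// Hypothetical helper: executeOnce(query) must run the stored procedure once and
// resolve to its response body, { deleted: number, continuation: boolean }.
async function runBulkDelete(executeOnce, query, maxRounds) {
    var total = 0;
    var limit = maxRounds || 1000; // safety cap; an assumption of this sketch
    for (var round = 0; round < limit; round++) {
        var body = await executeOnce(query);
        total += body.deleted;
        // continuation === false means the procedure found nothing left to delete.
        if (!body.continuation) return total;
    }
    throw new Error("Gave up after " + limit + " rounds; " + total + " documents deleted so far.");
}

// Possible wiring with the @azure/cosmos SDK (assumed, not tested here;
// "mydb", "mycontainer", "bulkDelete" and partitionKeyValue are placeholders):
//
// const { CosmosClient } = require("@azure/cosmos");
// const container = new CosmosClient({ endpoint, key }).database("mydb").container("mycontainer");
// const executeOnce = async (q) =>
//     (await container.scripts.storedProcedure("bulkDelete").execute(partitionKeyValue, [q])).resource;
// runBulkDelete(executeOnce, "SELECT c._self FROM c").then(console.log);
```

Keep in mind that a stored procedure runs against a single partition key value, so to clear a whole container you would repeat this loop once per partition key value.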

UPDATE

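I believe another advantage of setting "time to live" is that it doesn't take RUs away from your requests: as far as I know, expired documents are deleted by a background task that uses only leftover RUs, whereas deleting with a sproc is charged like any other request.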

David Klempfner
Zaphod
  • I set time to live to 1 second. Ran SELECT * FROM c and saw that no documents were coming back. Then a while later ran the query again and there were heaps of documents there. Is that normal? – David Klempfner Dec 19 '19 at 06:28
  • That depends. If all those documents were created less than a second ago, yes. But if you mean that you turned off time to live, redid the SELECT, and then got a lot of documents you thought had been deleted by time to live, then please read the first paragraph of this answer again =) Short answer: yes, that is normal. – Zaphod Dec 19 '19 at 09:49
  • @DavidKlempfner See the comment above; I forgot to @ you. – Zaphod Dec 19 '19 at 14:42
  • For those curious what "this process takes some time" means... I just did this on a small collection of 4k. Nothing happened for about 30ish minutes, then everything was gone a minute later. – mrdavidkendall Oct 05 '20 at 16:32
  • @mrdavidkendall I'm not sure, but I think this is affected by how many RUs are allocated to the collection. – Zaphod Oct 06 '20 at 10:25
  • In the Cosmos emulator, setting TTL=1 removes the records, but when you set it back to the original value, the records come back. Furthermore, running the code above returns no records. – user2233706 Aug 26 '21 at 19:42
  • @user2233706 Yes. This behaviour is explained in the first paragraph of this post. – Zaphod Aug 30 '21 at 12:45
  • The above code does not work if running from the portal: https://stackoverflow.com/questions/69032264 – user2233706 Sep 02 '21 at 15:02
  • @user2233706 It works as it should, but you have an issue with how you've set up your partition keys. A stored procedure only works within a single partition. Since you have only one document per partition, any procedure can delete only one document at a time. Unless you change your architecture, you won't be able to do bulk deletes with stored procedures. – Zaphod Sep 03 '21 at 12:44