33

What's the query or some other quick way to delete all the documents matching the where condition in a collection?
I want something like DELETE * FROM c WHERE c.DocumentType = 'EULA' but, apparently, it doesn't work.

Note: I'm not looking for any C# implementation for this.

GorvGoyl
  • 42,508
  • 29
  • 229
  • 225
  • 1
    As Gaurav Mantri mentioned, it is not supported currently. There is a [feedback](https://feedback.azure.com/forums/263030-documentdb/suggestions/6346033-set-based-operations-insert-update-delete) under review. – Tom Sun - MSFT Jun 27 '17 at 09:28
  • Interestingly this is possible using the Graph APIs as you can do `.drop()` on the result of a traversal to remove all selected docs.... – Jesse Carter Jun 27 '17 at 14:35
  • @JesseCarter could u describe in a bit detail – GorvGoyl Jun 27 '17 at 17:31
  • 1
    @JerryGoyal Unfortunately unless you're fully embracing graph I don't think it will work for your usecase as Cosmos expects a specific document format. But for example you could do something like `g.V().has('DocumentType', eq('EULA')).drop()` and clear out all docs that matched – Jesse Carter Jun 27 '17 at 19:16
  • where do i run this query – GorvGoyl Jun 28 '17 at 05:06

6 Answers6

21

This is a bit old but just had the same requirement and found a concrete example of what @Gaurav Mantri wrote about.

The stored procedure script is here:

https://social.msdn.microsoft.com/Forums/azure/en-US/ec9aa862-0516-47af-badd-dad8a4789dd8/delete-multiple-docdb-documents-within-the-azure-portal?forum=AzureDocumentDB

Go to the Azure portal, grab the script from above and make a new stored procedure in the database->collection you need to delete from.

Then right at the bottom of the stored procedure pane, underneath the script textarea is a place to put in the parameter. In my case I just want to delete all so I used:

SELECT c._self FROM c

I guess yours would be:

SELECT c._self FROM c WHERE c.DocumentType = 'EULA'

Then hit 'Save and Execute'. Viola, some documents get deleted. After I got it working in the Azure Portal I switched over the Azure DocumentDB Studio and got a better view of what was happening. I.e. I could see I was throttled to deleting 18 a time (returned in the results). For some reason I couldn't see this in the Azure Portal.

Anyway, pretty handy even if limited to a certain amount of deletes per execution. Executing the sp is also throttled so you can't just mash the keyboard. I think I would just delete and recreate the Collection unless I had a manageable number of documents to delete (thinking <500).

Props to Mimi Gentz @Microsoft for sharing the script in the link above.

HTH

Nexaspx
  • 371
  • 4
  • 20
Ben E G
  • 971
  • 8
  • 16
  • 2
    Thanks for this, it is helpful when you need to run it against a single partition. But where the query itself is cross partition the stored procedure option is not feasible as you can only run a store proc against a partition currently. – Gopal Krishnan Nov 12 '19 at 13:24
19

I want something like DELETE * FROM c WHERE c.DocumentType = 'EULA' but, apparently, it doesn't work.

Deleting documents this way is not supported. You would need to first select the documents using a SELECT query and then delete them separately. If you want, you can write the code for fetching & deleting in a stored procedure and then execute that stored procedure.

Gaurav Mantri
  • 128,066
  • 12
  • 206
  • 241
  • i have code for select now that selected data i want to delete from cosmos how to do that please guide with code snippet – Neo Oct 15 '20 at 14:59
7

I wrote a script to list all the documents and delete all the documents, it can be modified to delete the selected documents as well.

var docdb = require("documentdb");
var async = require("async");

var config = {
  host: "https://xxxx.documents.azure.com:443/",
  auth: {
    masterKey: "xxxx"
  }
};

var client = new docdb.DocumentClient(config.host, config.auth);

var messagesLink = docdb.UriFactory.createDocumentCollectionUri("xxxx", "xxxx");

var listAll = function(callback) {
  var spec = {
    query: "SELECT * FROM c",
    parameters: []
  };

  client.queryDocuments(messagesLink, spec).toArray((err, results) => {
    callback(err, results);
  });
};

var deleteAll = function() {
  listAll((err, results) => {
    if (err) {
      console.log(err);
    } else {
      async.forEach(results, (message, next) => {
        client.deleteDocument(message._self, err => {
          if (err) {
            console.log(err);
            next(err);
          } else {
            next();
          }
        });
      });
    }
  });
};

var task = process.argv[2];
switch (task) {
  case "listAll":
    listAll((err, results) => {
      if (err) {
        console.error(err);
      } else {
        console.log(results);
      }
    });
    break;
  case "deleteAll":
    deleteAll();
    break;

  default:
    console.log("Commands:");
    console.log("listAll deleteAll");
    break;
}
Norcino
  • 5,850
  • 6
  • 25
  • 42
user1172274
  • 111
  • 1
  • 4
  • 5
    Welcome to Stack Overflow! While links are great way of sharing knowledge, they won't really answer the question if they get broken in the future. Add to your answer the essential content of the link which answers the question. In case the content is too complex or too big to fit here, describe the general idea of the proposed solution. Remember to always keep a link reference to the original solution's website. See: [How do I write a good answer?](https://stackoverflow.com/help/how-to-answer) – sɐunıɔןɐqɐp Jul 26 '18 at 08:22
1

And if you want to do it in C#/Dotnet Core, this project may help: https://github.com/lokijota/CosmosDbDeleteDocumentsByQuery. It's a simple Visual Studio project where you specify a SELECT query, and all the matches will be a) backed up to file; b) deleted, based on a set of flags.

0

create stored procedure in collection and execute it by passing select query with condition to delete. The major reason to use this stored proc is because of continuation token which will reduce RUs to huge extent and will cost less.

Sarang Kulkarni
  • 367
  • 2
  • 6
-1
##### Here is the python script which can be used to delete data from Partitioned Cosmos Collection #### This will delete documents Id by Id based on the result set data.

Identify the data that needs to be deleted before below step

res_list = "select id from id_del"
res_id = [{id:x["id"]} 
             for x in sqlContext.sql(res_list).rdd.collect()]
config = {
   "Endpoint" : "Use EndPoint"
  "Masterkey" : "UseKey", 
      "WritingBatchSize" : "5000",
    'DOCUMENTDB_DATABASE': 'Database',
    'DOCUMENTDB_COLLECTION': 'collection-core'
}; 

for row in res_id:
# Initialize the Python DocumentDB client
  client = document_client.DocumentClient(config['Endpoint'], {'masterKey': config['Masterkey']})

# use a SQL based query to get   documents

## Looping thru partition to delete

  query = { 'query': "SELECT c.id FROM c where c.id = "+ "'" +row[id]+"'"   }
  print(query)
  options = {}
  options['enableCrossPartitionQuery'] = True
  options['maxItemCount'] = 1000
  result_iterable = client.QueryDocuments('dbs/Database/colls/collection-core', query, options)
  results = list(result_iterable)
  print('DOCS TO BE DELETED : ' + str(len(results)))
  if len(results) > 0 :
      for i in range(0,len(results)):
      #  print(results[i]['id'])
          docID = results[i]['id']
          print("docID :" + docID)
          options = {}
          options['enableCrossPartitionQuery'] = True
          options['maxItemCount'] = 1000
          options['partitionKey'] = docID
          client.DeleteDocument('dbs/Database/colls/collection-core/docs/'+docID,options=options)
          print ('deleted Partition:' +  docID)