18

With the Firebase Realtime Database we can delete a huge list of items with a single command, simply by calling remove() on the parent node (the node is deleted, and all its children too).

But according to the Firestore documentation (https://firebase.google.com/docs/firestore/manage-data/delete-data#collections),
to delete a collection we have to code a batch that loops over all its documents and deletes them one by one.

This is not efficient at all. Is it because Firestore is still in beta, or is it structurally impossible to delete the full node (collection) in a single call?

LeeHunter
ThierryC

3 Answers

24

The RTDB is able to do this because each database is local to a single region. In order to provide a serialized view, when you call remove(), the database stops all other work until the removal is complete.

This behavior has been the cause of several apparent outages: if a remove() call has to delete huge swaths of data, all other activity is effectively locked out until it completes. As a result, even for RTDB users who want to delete large quantities of data, we have recommended recursively finding and deleting documents in groups (CLI, node.js).

Firestore, on the other hand, is based on more traditional Google-style storage infrastructure where different ranges of keys are assigned dynamically to different servers (storage isn't actually backed by BigTable, but the same principles apply). This means that deleting data is no longer necessarily a single-region action, and it becomes very expensive to make the deletion appear transactional. Firestore transactions are currently limited to 100 participants, which means that any non-trivial transactional bulk deletion is impossible.

We're investigating how best to surface an API that does a bulk deletion without promising transactional behavior. It's straightforward to imagine how to do this from a mobile client, but as you've observed, this wouldn't be efficient if all we did was embed the loop and batch delete for you. We also don't want to make REST clients second-class citizens.

Firestore is a new product and there are a ton of things still to do. Unfortunately this just hasn't made the cut. While this is something we hope to address eventually, I can't provide any timeline on when that would be.

In the meantime the console and the firebase command-line both provide a non-transactional means of doing this, e.g. for test automation.

Thanks for your understanding and thanks for trying Firestore!

Kato
Gil Gilbert
  • I think the example in the Firestore doc (linked by @toofoo in the question) is incorrect. ´resolve()´ should be called when ´(numDeleted == 0)´. – Leo Dec 31 '17 at 14:54
  • BTW, should upload currently be done as batch? Or could transactions be used? – Leo Dec 31 '17 at 14:58
  • And another question. You wrote "bulk deletion without promising transactional behavior". I need transactional behavior. Can I hope for that? – Leo Dec 31 '17 at 15:06
  • 1
    Deletion of large numbers of documents in a single transaction is not anything we're planning for. If you really need to remove large numbers of documents at the same time you may need to search for alternatives to deletion. For example: you could name which subcollection to use in a parent and then change the name when you want to logically delete the contents. – Gil Gilbert Jan 29 '18 at 17:48
  • 1
    @GilGilbert is it possible to change the name of collection or the id of document? if so , any link to how – Snake Mar 06 '18 at 06:46
  • Documents and collections can't be renamed. – Gil Gilbert Apr 10 '20 at 15:44
  • @GilGilbert any chance that collection removals are nearly ready for prime time? It'd be hugely helpful for some work we're doing; I'm in the midst of waiting for a 95k document collection to delete for the fourth time today, and each time takes >30 minutes from the UI, so there's a lot of wasted time. – bsplosion May 11 '21 at 23:23
  • Hi! We are working on adding a Recursive Delete API to our Server SDKs. This API is already available in Node (https://googleapis.dev/nodejs/firestore/latest/Firestore_.html#recursiveDelete) and will soon be added to the Java SDK. This API makes it easier to delete collections and uses new backend APIs for bulk data operations. It does, however, still use the SDK to collect the documents that need to be deleted and to then issue the delete requests. – Sebastian Schmidt May 14 '21 at 15:33
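The "logical delete" idea from the comments above (store the name of the live subcollection in the parent, then switch the name instead of deleting) could be sketched roughly as follows in Python. This is only an illustration under assumptions: the field name `active_subcollection`, the helper names, and the `items` prefix are all hypothetical, and `parent_ref` is assumed to behave like a google-cloud-firestore `DocumentReference` (only `get()`, `set(..., merge=True)` and `collection()` are used). The old documents are not removed by this; they simply stop being referenced and would still need a background cleanup if storage matters.

```python
import uuid

# Hypothetical field on the parent document that names the live subcollection.
POINTER_FIELD = "active_subcollection"


def active_collection(parent_ref):
    """Return the subcollection the parent currently points at."""
    data = parent_ref.get().to_dict() or {}
    name = data.get(POINTER_FIELD)
    if name is None:
        raise LookupError("parent has no active subcollection yet")
    return parent_ref.collection(name)


def logically_delete(parent_ref, prefix="items"):
    """Point the parent at a fresh, empty subcollection name.

    This is a single document write, so readers that always go through
    active_collection() see the contents disappear at once. The documents
    in the old subcollection still exist until they are cleaned up.
    """
    new_name = "{}_{}".format(prefix, uuid.uuid4().hex)
    parent_ref.set({POINTER_FIELD: new_name}, merge=True)
    return new_name
```

All reads and writes then have to go through `active_collection(parent_ref)` rather than a hard-coded subcollection name, which is the main cost of this design.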
7

I was happily refactoring my app for Firestore from Realtime Database, enjoying the shorter code and simpler syntax, until I refactored the delete() functions! To delete a document with subcollections:

  • Create an array of promises.
  • get() a subcollection that doesn't have further subcollections.
  • Iterate through a forEach() function to read each document in the subcollection.
  • Delete each document, and push the delete command into the array of promises.
  • Go on to the next subcollection and repeat this.
  • Use Promise.all(arrayOfPromises) to wait until all the subcollections have been deleted.
  • Then delete the top-level document.

With multiple layers of collections and documents you'll want to make that a function, then call it from another function to handle the next higher layer, and so on.

You can see this in the console. To manually delete collections and documents, delete the right-most document, then delete the right-most collection, and so on working left.
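The layered, bottom-up procedure described above can be sketched as one recursive function. The version below is in Python rather than AngularJS and is only a sketch under assumptions: `delete_document_recursive` is a hypothetical helper name, and `doc_ref` is assumed to behave like a google-cloud-firestore `DocumentReference`, of which only `collections()`, `delete()`, and the subcollection's `list_documents()` are used. It deletes documents one at a time, so it is not fast and not transactional.

```python
def delete_document_recursive(doc_ref):
    """Delete everything below doc_ref, then doc_ref itself.

    Works depth-first: the deepest documents are removed first, and a
    document is deleted only after all of its subcollections are empty.
    Returns the number of documents deleted.
    """
    deleted = 0
    for sub_collection in doc_ref.collections():
        for child_ref in sub_collection.list_documents():
            deleted += delete_document_recursive(child_ref)
    doc_ref.delete()
    return deleted + 1


# Usage (needs google-cloud-firestore and credentials; paths are made up):
# from google.cloud import firestore
# db = firestore.Client()
# delete_document_recursive(db.collection("movies").document("clip_1"))
```

Because the function discovers subcollections itself, the same call works no matter how many layers sit below the document.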

Here's my code, in AngularJS. It only works if the top-level collection wasn't deleted before the subcollections.

$scope.deleteClip = function(docId) {
  if (docId === undefined) {
    docId = $scope.movieOrTvShow + '_' + $scope.clipInMovieModel;
  }
  $scope.languageVideos = longLanguageFactory.toController($scope.language) + 'Videos';
  // Reference to the clip document whose subcollections must be emptied first.
  var clipRef = firebase.firestore().collection($scope.languageVideos).doc($scope.movieOrTvShow).collection('Video Clips').doc(docId);

  // Read one subcollection and delete each of its documents.
  // The returned promise resolves only after every delete has finished.
  function deleteSubcollection(name) {
    return clipRef.collection(name).get().then(function(snapshot) {
      var promises = [];
      snapshot.forEach(function(doc) {
        console.log(doc.id);
        promises.push(doc.ref.delete());
      });
      return Promise.all(promises);
    });
  }

  // Wait for both subcollections before deleting the parent document.
  Promise.all([
    deleteSubcollection('SentenceTranslations'),
    deleteSubcollection('SentenceExplanations')
  ])
  .then(function() {
    console.log("All subcollections deleted.");
    return clipRef.delete();
  })
  .then(function() {
    console.log("Collection deleted.");
    $scope.clipInMovieModel = null;
    $scope.$apply();
  })
  .catch(function(error) {
    console.log("Remove failed: " + error.message);
  });
};

All that would have been one line in Realtime Database.

Thomas David Kehoe
5

This is the fastest way I have found to delete all documents in a collection: a mix of the Python delete-collection loop and the Python batch method.

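# Assumes the google-cloud-firestore client; db could equally come from
# firebase_admin.firestore.client().
from google.cloud import firestore

db = firestore.Client()

def delete_collection(coll_ref, batch_size, counter):
    # Queue up to batch_size deletes in a single batched write.
    batch = db.batch()
    docs = coll_ref.limit(batch_size).stream()
    deleted = 0

    for doc in docs:
        batch.delete(doc.reference)
        deleted = deleted + 1

    batch.commit()

    if deleted >= batch_size:
        # A full batch means more documents may remain, so go again.
        new_counter = counter + deleted
        print("potentially deleted: " + str(new_counter))
        return delete_collection(coll_ref, batch_size, new_counter)

delete_collection(db.collection(u'productsNew'), 500, 0)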

This deletes all documents from the collection "productsNew" in blocks of 500, which is currently the maximum number of documents that can be passed to a commit. See Firebase write and transaction quotas.

You can get more sophisticated and also handle API errors, but this works fine for me.
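As a rough idea of what handling API errors could look like, here is a sketch of an iterative variant with a simple retry. It is an illustration, not a tested production routine: `delete_in_batches`, `max_retries`, `backoff_seconds` and `retry_on` are hypothetical names, and `db` and `coll_ref` are assumed to behave like a google-cloud-firestore client and collection reference (only `batch()`, `limit()`, `stream()`, `delete()` and `commit()` are used). The loop also avoids the recursion depth limit on very large collections.

```python
import time


def _commit_deletes(db, docs):
    """Build a fresh batch for these documents and commit it."""
    batch = db.batch()
    for doc in docs:
        batch.delete(doc.reference)
    batch.commit()


def delete_in_batches(db, coll_ref, batch_size=500, max_retries=3,
                      backoff_seconds=1.0, retry_on=(Exception,)):
    """Delete a collection in batches, retrying a failed commit.

    retry_on should be narrowed to the errors you consider transient
    (for instance classes from google.api_core.exceptions); the broad
    default is only there to keep the sketch self-contained.
    Returns the number of documents deleted.
    """
    total = 0
    while True:
        docs = list(coll_ref.limit(batch_size).stream())
        if not docs:
            return total
        for attempt in range(1, max_retries + 1):
            try:
                _commit_deletes(db, docs)
                break
            except retry_on:
                if attempt == max_retries:
                    raise
                # Wait a little longer after each failed attempt.
                time.sleep(backoff_seconds * attempt)
        total += len(docs)
```

Rebuilding the batch on every attempt is a deliberate choice here, so the retry does not depend on whether a batch object can be committed again after a failure.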