89

How do I truncate a collection in MongoDB or is there such a thing?

Right now I have to delete 6 large collections all at once, so I'm stopping the server, deleting the database files, and then recreating the database and the collections in it. Is there a way to delete the data and leave the collection as it is? The delete operation takes a very long time. I have millions of entries in the collections.

iefpw
  • 6,816
  • 15
  • 55
  • 79
  • How are you doing the delete operation (when it is taking a very long time)? – Stennie May 11 '13 at 07:47
  • 13
    db.collection.remove({ }); – iefpw May 11 '13 at 18:14
  • 4
    Ah! The `remove()` command will be **much** slower than `drop()` for a large collection because it does the extra housekeeping of updating indexes as documents are deleted. If you are deleting all documents in a collection then `drop()` is generally the best approach. The caveat with dropping & recreating the same collection is that you will also need to re-ensure any secondary indexes. – Stennie May 11 '13 at 22:02

7 Answers

93

To truncate a collection and keep the indexes, use

 db.<collection>.remove({})
astroanu
  • 3,901
  • 2
  • 36
  • 50
  • 1
    This works, but index sizes stay the same. Is there a way to truncate the indexes as well, but keep them anyway? – logic Nov 10 '16 at 10:13
  • Notice, however, that remove() is much slower than drop(). – rlib Nov 30 '16 at 22:27
  • @rlib yes it should be :) – astroanu Dec 01 '16 at 06:09
  • 3
    @astroanu Yes, surely. Last time I deleted 15 billion docs with remove() and it took a very long time... Actually, unacceptably long. So drop(), createCollection() and createIndex() is the only way to perform a truncate operation. – rlib Dec 01 '16 at 14:02
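
The drop/recreate pattern described in the comments above can be sketched in the mongo shell as follows (the collection name `events` and the index key `{ userId: 1 }` are hypothetical placeholders):

    // 1. Capture the existing index definitions before dropping.
    var indexes = db.events.getIndexes();

    // 2. Drop the collection: O(1), removes data and indexes together.
    db.events.drop();

    // 3. Recreate the collection (optional; an insert would also create it).
    db.createCollection("events");

    // 4. Re-create each secondary index (the default _id index is automatic).
    indexes.forEach(function(idx) {
        if (idx.name !== "_id_") {
            db.events.createIndex(idx.key, { name: idx.name });
        }
    });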
61

You can efficiently drop all data and indexes for a collection with db.collection.drop(). Dropping a collection with a large number of documents and/or indexes will be significantly more efficient than deleting all documents using db.collection.remove({}). The remove() method does the extra housekeeping of updating indexes as documents are deleted, and would be even slower in a replica set environment where the oplog would include entries for each document removed rather than a single collection drop command.

Example using the mongo shell:

var dbName = 'nukeme';
db.getSiblingDB(dbName).getCollectionNames().forEach(function(collName) {
    // Drop all collections except system ones (indexes/profile)
    if (!collName.startsWith("system.")) {
        // Safety hat
        print("WARNING: going to drop ["+dbName+"."+collName+"] in 5s .. hit Ctrl-C if you've changed your mind!");
        sleep(5000);
        db[collName].drop();
    }
})

It is worth noting that dropping a collection has different outcomes on storage usage depending on the configured storage engine:

  • WiredTiger (default storage engine in MongoDB 3.2 or newer) will free the space used by a dropped collection (and any associated indexes) once the drop completes.
  • MMAPv1 (default storage engine in MongoDB 3.0 and older) will not free up preallocated disk space. This may be fine for your use case; the free space is available for reuse when new data is inserted.

If you are instead dropping the database, you generally don't need to explicitly create the collections as they will be created as documents are inserted.

However, here is an example of dropping and recreating the database with the same collection names in the mongo shell:

var dbName = 'nukeme';

// Save the old collection names before dropping the DB
var oldNames = db.getSiblingDB(dbName).getCollectionNames();

// Safety hat
print("WARNING: going to drop ["+dbName+"] in 5s .. hit Ctrl-C if you've changed your mind!")
sleep(5000)

db.getSiblingDB(dbName).dropDatabase();

// Recreate database with the same collection names
oldNames.forEach(function(collName) {
    db.getSiblingDB(dbName).createCollection(collName);
})
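
If you also need to preserve secondary indexes across the drop, one approach (a sketch; it assumes `getIndexes()` is available in your shell version) is to capture each collection's index definitions before dropping the database:

    var dbName = 'nukeme';
    var target = db.getSiblingDB(dbName);

    // Capture index definitions per collection before dropping the database
    var savedIndexes = {};
    target.getCollectionNames().forEach(function(collName) {
        savedIndexes[collName] = target[collName].getIndexes();
    });

    target.dropDatabase();

    // Recreate collections and their secondary indexes (_id is automatic)
    Object.keys(savedIndexes).forEach(function(collName) {
        target.createCollection(collName);
        savedIndexes[collName].forEach(function(idx) {
            if (idx.name !== "_id_") {
                target[collName].createIndex(idx.key, { name: idx.name });
            }
        });
    });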
Stennie
  • 63,885
  • 14
  • 149
  • 175
  • 1
    I forgot to note that if you drop & recreate the collection you will also need to add any secondary indexes. You can list the existing index definitions with `db.system.indexes.find()`. – Stennie May 11 '13 at 21:57
  • Won't this wreak havoc when performed on a partitioned/sharded collection? – zamnuts Sep 23 '15 at 20:33
  • @zamnuts Thanks for commenting! There was a subsequent issue found with dropping & reusing namespaces in a sharded cluster: [SERVER-17397 - dropping a Database or Collection in a Sharded Cluster may not fully succeed](https://jira.mongodb.org/browse/SERVER-17397). This was reported against MongoDB 2.6+ (unconfirmed if earlier versions have the same issue). There is a workaround noted on that issue which involves some extra steps to ensure the config server has been updated and the `mongos` cache is cleared before the namespaces are recreated. – Stennie Sep 23 '15 at 20:48
  • In practice, this is a bad idea. The problem is that dropping a collection also drops its indexes, including _id. If you attempt to drop and re-use a collection immediately, mongodb seems to background the index destruction and you can get the obscure error message 'Operation aborted because: all indexes on collection dropped'. It is safer to use remove(). – Wheezil Feb 16 '18 at 00:54
  • @Wheezil Your issue sounds like something separate; perhaps dropping a collection when it is actively being used (which invalidates open cursors that might be reading from an index). Dropping and reusing collections in a sharded cluster is not a great idea until SERVER-17397 has been resolved. Dropping & reusing a collection in a replica set or standalone deployment is safe and faster than removing documents and index entries. If you do have a specific case to investigate, can you post a new question with relevant details on DBA StackExchange? Much has changed since this was posted in 2013. – Stennie Feb 16 '18 at 05:09
  • @Stennie I did discuss this with MongoDB support. They do not think it is possible for my observed issue to occur in strictly single-threaded operation because the database is locked during table drop. I have yet to confirm this observation in a more controlled environment, for the time being I am playing it safe. I'll revise my observation to "if there is any possibility of multi-thread access..." until I have a chance to reproduce this more carefully. – Wheezil Feb 17 '18 at 16:39
  • @Wheezil I'm a member of the MongoDB Technical Services team ;-). The issue you reported appears to be due to dropping a collection while a bulk write operation was in progress. This error message is only set in one place in the server code: when all indexes are dropped and [open cursors are invalidated](https://github.com/mongodb/mongo/blob/6408164d14181b8717bdcb462456a90c16020a42/src/mongo/db/catalog/index_catalog.cpp#L821). Any operations which have an open cursor on an index will be aborted when the index is dropped, so the bulk write operation must have started before the collection drop. – Stennie Feb 17 '18 at 22:37
  • @Wheezil If the bulk write was definitely started in the same thread after the collection drop command returned, you should follow up on the CS case with more details (code example and log file). – Stennie Feb 17 '18 at 22:41
  • @Stennie I've attempted to recreate this error in a single-threaded scenario and I am unable to do so, so I retract my original statement. It seems that care must be exercised only in multi-thread scenarios to avoid simultaneous insert/drop. – Wheezil Feb 19 '18 at 14:57
  • 1
    Dropping a collection locks the database where it sits. So doing this in a heavily used prod database will result in a performance impact because of ticket starvation. – Jeep87c Nov 14 '18 at 20:47
19

The query below will delete all records in a collection and keep the collection as is:

db.collectionname.remove({})
Daniel W.
  • 31,164
  • 13
  • 93
  • 151
IAmHomes
  • 513
  • 2
  • 11
  • 20
15

remove() is deprecated as of MongoDB 4.0.

Use deleteMany() (or another replacement method) instead:

db.<collection>.deleteMany({})
Chalist
  • 3,160
  • 5
  • 39
  • 68
6

There is no equivalent to the "truncate" operation in MongoDB. You can either remove all documents, which has a complexity of O(n), or drop the collection, which has a complexity of O(1) but means you will lose your indexes.

Alon
  • 10,381
  • 23
  • 88
  • 152
4

Create the database and the collections and then backup the database to bson files using mongodump:

mongodump --db database-to-use

Then, when you need to drop the database and recreate the previous environment, just use mongorestore:

mongorestore --drop

When you use mongodump, the backup is saved in the current working directory, in a folder named dump.

vinipsmaker
  • 2,215
  • 2
  • 19
  • 33
0

The db.collection.drop() method obtains a write lock on the affected database and blocks other operations until it has completed.

For that reason, I think using db.collection.remove({}) is better than drop().

Adrian W
  • 4,563
  • 11
  • 38
  • 52