20

The official MongoDB driver offers both a 'count' and an 'estimated document count' API. As far as I know, the former is highly memory-intensive, so the latter is recommended in situations that require it.

But how accurate is this estimated document count? Can the count be trusted in a Production environment, or is using the count API recommended when absolute accuracy is needed?

Neil Lunn
ChazMcDingle
  • 3
    @Neil: I don't think this is a duplicate of the other question - that one was asked in 2015, way before `estimatedDocumentCount()` existed, and only one answer there tangentially refers to that method. – Dan Dascalescu May 07 '19 at 06:52
  • Just use count(), it's deprecated, but it still works and it's faster than both. – ajsp Jun 15 '21 at 20:28

3 Answers

21

Comparing the two, I find it very difficult to conjure up a scenario in which you'd want to use countDocuments() when estimatedDocumentCount() is an option.

That is, the equivalent form of estimatedDocumentCount() is countDocuments({}), i.e., an empty query filter. The cost of the first function is O(1); the second is O(N), and if N is very large, the cost will be prohibitive.
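
To make the equivalence concrete, here is a minimal sketch in Python. It assumes a PyMongo-style collection object, where the two methods are spelled `estimated_document_count()` and `count_documents()`; the helper name `both_counts` is hypothetical and only there for illustration.

```python
def both_counts(collection):
    """Return (estimated, exact) document counts for a collection.

    `collection` is assumed to be a PyMongo-style Collection object.
    """
    # Reads the count from collection metadata; it takes no filter.
    estimated = collection.estimated_document_count()
    # An empty filter matches every document, so the documents are actually counted.
    exact = collection.count_documents({})
    return estimated, exact
```

On a quiet collection the two numbers would normally agree; the difference is in how much work the server does to produce them.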

Both return a count that, in a live MongoDB deployment, is likely to be quite ephemeral: it may be out of date the moment you have it, because the collection keeps changing.

Allan Bazinet
    Estimated count apparently is no good if you want to come up with the total number of docs satisfying some query. This is required e.g. when performing server-side pagination and you want to know the total number of pages. – Avius Oct 16 '19 at 21:41
  • 1
    The question was about `estimatedDocumentCount()`, which is not germane to a query. If you're looking for the total number of documents satisfying some query, then you (a) can't use `estimatedDocumentCount()` and (b) aren't going to use the no-arg version of `countDocuments()`. – Allan Bazinet Dec 17 '19 at 16:05
10

Please review the MongoDB documentation for estimatedDocumentCount(). Specifically, it notes that "After an unclean shutdown of a mongod using the Wired Tiger storage engine, count statistics reported by db.collection.estimatedDocumentCount() may be inaccurate." This is because the count comes from collection metadata rather than from the documents themselves, and that metadata can drift between checkpoints; the drift will typically be resolved after 60 seconds or so.

In contrast, the MongoDB documentation for countDocuments() states that this method is a wrapper around an aggregation whose $group stage uses $sum to count the result set, ensuring absolute accuracy of the count.
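
As a rough illustration of what that wrapper amounts to, the sketch below builds the same kind of `$match` plus `$group`/`$sum` pipeline by hand in Python. It assumes a PyMongo-style collection with an `aggregate()` method; the function names `count_pipeline` and `count_via_aggregate` are hypothetical, and this is not the driver's actual implementation.

```python
def count_pipeline(query=None):
    """Build an aggregation pipeline that counts documents matching `query`."""
    return [
        # Keep only the documents that satisfy the filter (all of them if empty).
        {"$match": query or {}},
        # Collapse everything into one group and add 1 per document.
        {"$group": {"_id": None, "n": {"$sum": 1}}},
    ]


def count_via_aggregate(collection, query=None):
    """Run the pipeline on a PyMongo-style collection and return the count."""
    results = list(collection.aggregate(count_pipeline(query)))
    # When nothing matches, $group emits no result document at all.
    return results[0]["n"] if results else 0
```

Because every matching document passes through the pipeline, the cost grows with the number of matches, which is the price of the exact count.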

Thus, if absolute accuracy is essential, use countDocuments(). If all you need is a rough estimate, use estimatedDocumentCount(). The names are accurate to their purpose and should be used accordingly.
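
That rule of thumb could be wrapped up along these lines. This is only a sketch in Python against a PyMongo-style collection; the helper `document_count` and its `exact` flag are made up for illustration.

```python
def document_count(collection, query=None, exact=False):
    """Pick a counting method based on whether accuracy or a filter is needed.

    `collection` is assumed to be a PyMongo-style Collection object.
    """
    if query or exact:
        # A filter, or a demand for accuracy, requires actually counting documents.
        return collection.count_documents(query or {})
    # Otherwise the metadata-based estimate is the cheaper option.
    return collection.estimated_document_count()
```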

B. Fleming
0

The main difference is filtering.

count_documents accepts a filter like a normal query, whereas estimated_document_count does not.

If filtering is not part of your use case then I would use estimated_document_count since it is much faster.
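
For a case where filtering is part of the use case, such as working out the number of pages for server-side pagination, a sketch might look like the following. It assumes a PyMongo-style collection; `total_pages`, `query`, and `page_size` are hypothetical names.

```python
import math


def total_pages(collection, query, page_size):
    """Return how many pages are needed to show every document matching `query`."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Only count_documents can take a filter; the estimate covers the whole collection.
    matching = collection.count_documents(query)
    return math.ceil(matching / page_size)
```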

axelmukwena
joeyagreco