
I have a collection with around 44 million documents. I am adding new documents and at the same time deleting old ones. The deletion is currently slightly faster than the adding, so the number of documents is decreasing. I am currently down to 32 million.

Now the strange thing is that the collection's dataSize (not the storageSize) keeps increasing. With 44 million documents it was roughly 3.1 TB. Now, with 32 million documents, it is roughly 4 TB. The documents added and removed are more or less the same size.
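To put numbers on why this looks wrong, here is a quick back-of-the-envelope calculation of the average document size implied by those figures. It is only a sketch: it assumes 1 TB = 10^12 bytes and uses the rounded sizes and counts quoted above, and the function name is made up for illustration.

```javascript
// Implied average document size from the figures quoted in the question.
// Assumption: "TB" means 10^12 bytes; sizes and counts are the rounded values above.
var TB = 1e12;

// Hypothetical helper: average bytes per document for a given data size and count.
function impliedAvgDocBytes(dataSizeBytes, docCount) {
  return dataSizeBytes / docCount;
}

var before = impliedAvgDocBytes(3.1 * TB, 44e6); // 44 million docs, ~3.1 TB
var after = impliedAvgDocBytes(4 * TB, 32e6);    // 32 million docs, ~4 TB

console.log("avg bytes per doc before: " + Math.round(before));
console.log("avg bytes per doc after:  " + Math.round(after));
console.log("growth factor:            " + (after / before).toFixed(2));
```

So the reported numbers imply that the average document grew from roughly 70 KB to 125 KB, which does not match documents that are more or less the same size.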

Could there be a bug in mongo not updating the dataSize correctly? Somehow it is not recognising the document deletions. I understand that storageSize is not affected by document deletion, but dataSize should be. Can I somehow force mongo to recount its dataSize statistics?
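For reference, this is roughly how I am cross-checking the reported statistics. It is a minimal sketch, not a definitive check: `checkCollStats` is a hypothetical helper, and it assumes the object passed in has the `count`, `size`, `avgObjSize` and `storageSize` fields that I understand `db.<collection>.stats()` to return. The sample values are made up for illustration and are not real output.

```javascript
// Hypothetical helper: takes an object shaped like the output of
// db.<collection>.stats() and derives a few figures for a sanity check.
// Assumed fields: count, size, avgObjSize, storageSize.
function checkCollStats(stats) {
  // Average size implied by size / count (0 for an empty collection).
  var implied = stats.count > 0 ? stats.size / stats.count : 0;
  return {
    impliedAvgObjSize: implied,
    reportedAvgObjSize: stats.avgObjSize,
    // Fraction of the allocated storage that the reported data size occupies.
    sizeToStorageRatio: stats.size / stats.storageSize
  };
}

// Made-up sample values, NOT real output from my collection.
var sample = { count: 1000, size: 125000000, avgObjSize: 125000, storageSize: 200000000 };
var result = checkCollStats(sample);
console.log(JSON.stringify(result));
```

In the shell I would call it as `checkCollStats(db.mycollection.stats())`, where `mycollection` is a placeholder name.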

As mentioned before, this question is about the actual data size. I do not care about the file size on the disk. I know that this does not shrink when I delete documents.

I am using a rather old version of MongoDB (2.6.10).

Michael
  • For future reference, this sort of question belongs on [dba.stackexchange.com](https://dba.stackexchange.com) when it is not actually related to programming itself. There are some such answers still around for historical reasons, from before the satellite sites were created. – Neil Lunn Aug 01 '17 at 10:46
  • @Neil Lunn Sorry, but those answers all refer to the database size **on disk** (i.e., the so-called storageSize). My question is about the dataSize. According to mongo this is the actual size of my data, and that should decrease when I delete documents (without the need to do any repairDatabase operation or similar). Please re-open as this is not a duplicate. As I see it, the dataSize should always correspond to the size of the data. It can be obtained via `db.stats().dataSize` – Michael Aug 02 '17 at 09:19
  • Then go ask your question on the correct site. This is not a programming topic. – Neil Lunn Aug 02 '17 at 09:23
  • @Neil Lunn What is not a programming topic? I thought Stack Overflow is a site for asking questions about software-related issues. Is it not? I need to know the correct dataSize of my collection. It seems that mongo is not returning the right size. Maybe I'm wrong and it is. Or maybe I have to tell mongo to recount its data. If so, I don't know how. – Michael Aug 02 '17 at 09:33
