I'm developing a system with a "Group" collection and many other document collections that need to refer to this Group document. There will only ever be ~10,000 groups, so using the 12-byte ObjectId leads to bloat in the size of these other collections.

I would ideally like a sort of cut-down ObjectId that can still be generated on distributed nodes and have a high chance of uniqueness but be maybe 6 bytes in size. I'm thinking just:

  • a 2-byte machine identifier,
  • a 2-byte process id,
  • a 2-byte hash of the seconds since the Unix epoch

Does something like this exist, or is this just a bad idea?

Update: I understand my suggestion above would lose the ability to sort by date.
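
For concreteness, the layout above could be sketched in Python roughly as follows. This is only an illustration under assumptions not fixed anywhere in the question: the function name `short_group_id` is hypothetical, the machine identifier is taken from a hash of the hostname, and the "hash" of the timestamp is simply the first two bytes of a SHA-256 digest.

```python
import hashlib
import os
import socket
import struct
import time


def short_group_id(now=None):
    """Return a 6-byte id: 2-byte machine hash + 2-byte pid + 2-byte time hash."""
    seconds = int(time.time() if now is None else now)
    # Assumption: the machine identifier is derived from the hostname.
    machine = hashlib.sha256(socket.gethostname().encode("utf-8")).digest()[:2]
    # Process ids can exceed 16 bits, so only the low 2 bytes are kept.
    pid = struct.pack(">H", os.getpid() & 0xFFFF)
    # Assumption: the timestamp "hash" is the first 2 bytes of a SHA-256 digest.
    stamp = hashlib.sha256(struct.pack(">Q", seconds)).digest()[:2]
    return machine + pid + stamp
```

Note that this layout has no counter, so two calls made by the same process within the same second return the same 6 bytes.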

Nic Cottrell
  • https://stackoverflow.com/questions/12211138/creating-custom-object-id-in-mongodb – MateenSheikh Jul 24 '17 at 09:15
  • I would have thought [Possibility of duplicate Mongo ObjectId's being generated in two different collections?](https://stackoverflow.com/questions/4677237/possibility-of-duplicate-mongo-objectids-being-generated-in-two-different-colle) was the more informative read here. The broad strokes being that "once you start tinkering" with your own design on something like this, then you start running into the very real possibility of "collisions". So that would be the main consideration, along with the fact that `ObjectId` "starts with" the timestamp part so it's monotonic, or "always increasing". – Neil Lunn Jul 24 '17 at 09:27
  • I always find it amusing when people make trivial edits to a question in response to a comment but do not acknowledge the commenter by replying in a comment. Ah, semantics! Speaking of which: *"a 4-byte value representing the seconds since the Unix epoch"*. There is a really good reason why that is 4 bytes: if you use a shorter integer, it is going to fail. Remind me how big a number fits in two bytes again? And how many seconds are there in a year? – Neil Lunn Jul 24 '17 at 09:48
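
To put rough numbers on the two questions in that last comment, here is a back-of-the-envelope check. It assumes a 365-day year and an unsigned 2-byte field holding raw seconds rather than a hash; the variable names are purely illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # seconds in a 365-day year
TWO_BYTE_VALUES = 2 ** 16              # distinct values in an unsigned 2-byte field
FOUR_BYTE_VALUES = 2 ** 32             # distinct values in an unsigned 4-byte field

# How long a raw seconds counter lasts before it wraps around.
hours_before_2_byte_wrap = TWO_BYTE_VALUES / 3600
years_before_4_byte_wrap = FOUR_BYTE_VALUES / SECONDS_PER_YEAR

print(SECONDS_PER_YEAR)
print(round(hours_before_2_byte_wrap, 1))
print(round(years_before_4_byte_wrap, 1))
```

A 2-byte seconds field therefore repeats in under a day, whereas the 4-byte field lasts for over a century.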

0 Answers