
I created an application using MongoDB and never set an _id field, so I am relying on Mongo's default ObjectId.

I can't generate an _id field myself now.

Is there a way to customize how mongo generates the objectId for a specific collection?

I want to change it to a unix timestamp object to ensure uniqueness.

user1175817
  • Is there a reason that the default _id doesn't ensure uniqueness for you? – cptroot Aug 01 '13 at 18:52
  • It does ensure uniqueness. My client wants a more user-friendly-looking reference. – user1175817 Aug 01 '13 at 18:56
  • There is no mechanism in pymongo to change how _id is generated for a collection. The chances of an ObjectId collision are small; see: http://stackoverflow.com/a/5694803/156427 – Ross Aug 19 '13 at 10:10

2 Answers


I'm pretty sure you cannot customize how mongo generates the objectId for a specific collection (short of modifying the source code and then rebuilding). You can certainly change the _id field though. Here's a quick and dirty code snippet demonstrating this:

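import time

import pymongo

conn = pymongo.MongoClient()
collection = conn['test']['test']

def check_uniqueness(proposed_id):
    # True when no existing document already uses this _id.
    # count_documents/insert_one are the current pymongo calls (3.7+).
    return collection.count_documents({'_id': proposed_id}) == 0

def main():
    while True:
        # Seconds since the epoch, as a float.
        proposed_id = time.time()
        if check_uniqueness(proposed_id):
            collection.insert_one({'_id': proposed_id})
            break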

Using a timestamp might not be the best idea, especially if you are connecting to your mongo instance from multiple machines whose clocks aren't synchronized. You could very easily generate conflicts, especially if your mongo instance is doing a lot of writes.

the_man_slim
  • For best performance, it might be better to catch the failure of an `insert` on a duplicate `_id` rather than try to `find` it first. With the `find` approach you still have a race condition where a second server (or another thread) could insert a document with the same `_id`/timestamp between the check and the insert. – WiredPrairie Aug 02 '13 at 17:55
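
To make that comment concrete, here is a rough sketch of the "insert and catch the duplicate" approach. It is only an illustration: `insert_with_retry`, `make_id`, and `max_attempts` are names I made up. With pymongo you would presumably pass `collection.insert_one` as `insert_one` and `pymongo.errors.DuplicateKeyError` as `duplicate_error`.

```python
import time


def insert_with_retry(insert_one, duplicate_error, make_id=time.time, max_attempts=5):
    """Insert a document with a generated _id, retrying on duplicate-key errors.

    insert_one: callable that takes a document (e.g. collection.insert_one).
    duplicate_error: exception type raised for a duplicate _id
        (assumed to be pymongo.errors.DuplicateKeyError when using pymongo).
    make_id, max_attempts: hypothetical knobs for this sketch.
    """
    for _ in range(max_attempts):
        doc = {'_id': make_id()}
        try:
            # Let the database enforce uniqueness; no separate find() needed.
            insert_one(doc)
            return doc['_id']
        except duplicate_error:
            # Another writer already used this _id; generate a new one.
            continue
    raise RuntimeError('no free _id after %d attempts' % max_attempts)
```

Because the unique check and the write happen in a single operation on the server, there is no window between "check" and "insert" for another writer to slip into.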

I want to change it to a unix timestamp object to ensure uniqueness.

Unix timestamps are not as unique as you might think. As Wikipedia puts it (implementations sometimes differ): http://en.wikipedia.org/wiki/Unix_time

Unix time, or POSIX time, is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC),

Because a UNIX timestamp only has one-second granularity, the ObjectId also carries an incrementing counter (inc), so that IDs generated within the same second still differ. In a large database it is actually quite common for several IDs to be generated within the same second.

If you use a timestamp alone you WILL face problems.
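
To see the two parts for yourself, here is a small standard-library sketch that splits an ObjectId's hex string into its leading 4-byte timestamp and trailing 3-byte counter. `split_object_id` is a name I made up, and the hex string is just a sample value, not one from the question.

```python
from datetime import datetime, timezone


def split_object_id(hex_id):
    """Split a 24-character hex ObjectId into (creation time, counter)."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError('an ObjectId is 12 bytes (24 hex characters)')
    # First 4 bytes: creation time in whole seconds since the Unix epoch.
    seconds = int.from_bytes(raw[:4], 'big')
    # Last 3 bytes: the incrementing counter that separates same-second IDs.
    counter = int.from_bytes(raw[9:], 'big')
    return datetime.fromtimestamp(seconds, tz=timezone.utc), counter


# Sample value for illustration only.
created, counter = split_object_id('507f1f77bcf86cd799439011')
print(created.isoformat(), counter)
```

Two ObjectIds created in the same second share the first part and differ in the rest, which is exactly what a bare timestamp cannot give you.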

Instead I would recommend you either:

  • Store two IDs, one user-friendly and one not
  • Look for something else to replace it; I do not know enough about your scenario to tell you what
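
As a rough idea of the first option, the sketch below leaves `_id` alone and adds a separate readable field. Everything named here is made up for illustration: the `reference` field, the `REF-` prefix, and the in-memory `_sequence`. A real application would need a counter that survives restarts and is shared by all writers, and probably a unique index on `reference`.

```python
import itertools

# Stand-in for a persistent, shared sequence (assumption: in-memory only here).
_sequence = itertools.count(1)


def new_document(payload, prefix='REF'):
    """Return a copy of payload with a readable reference added.

    No '_id' key is set, so the driver still assigns the default ObjectId
    on insert; 'reference' is the value you would show to users.
    """
    doc = dict(payload)
    doc['reference'] = '%s-%06d' % (prefix, next(_sequence))
    return doc
```

The client gets something readable to quote, while uniqueness of `_id` stays the database's job.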

Is there a way to customize how mongo generates the objectId for a specific collection?

As the answer by @the_man_slim shows in Python, you can insert your own _id. However, you cannot update the _id field once the document exists, so beware of that.

Sammaye