51

I am making an analytics system. The API call provides a Unique User ID, but it's not sequential and is too sparse.

I need to give each Unique User ID an auto-increment ID to mark an analytics datapoint in a bit array/bitset. So the first user encountered would correspond to the first bit of the bit array, the second user would be the second bit, and so on.
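Roughly what I'm after (a sketch; `getSequentialId` is the missing piece, and `bitset` stands for whatever bit-array implementation I end up using):

var uid = "550e8400-e29b-41d4-a716-446655440000"; // sparse, non-sequential Unique User ID
var index = getSequentialId(uid); // 0 for the first user seen, 1 for the second, ...
bitset.set(index);                // mark this user's bit for this datapoint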

So is there a solid and fast way to generate incremental Unique User IDs in MongoDB?

est
  • I'm facing the same problem as you: how to generate an ID to set a bitset position. Did you solve this problem? – brucenan Nov 13 '15 at 06:59
  • Hope this can help you: https://medium.com/@yesdeepakverma/implementing-sequence-types-in-mongodb-2de035582c23 – Deepak Verma Mar 20 '18 at 10:17
  • Maybe this can help you: https://www.mongodb.com/blog/post/generating-globally-unique-identifiers-for-use-with-mongodb – Guihgo Apr 28 '18 at 23:51

9 Answers

34

As the selected answer says, you can use findAndModify to generate sequential IDs.

But I strongly disagree with the opinion that you should not do that. It all depends on your business needs. Having a 12-byte ID may be very resource-consuming and cause significant scalability issues in the future.

I have a detailed answer here.
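For reference, a minimal mongo shell sketch of that findAndModify counters pattern (collection and field names like `counters` and `seq` are illustrative, not prescribed):

function getNextSequence(name) {
    // atomically increment the named counter and return the new value;
    // upsert creates the counter document on first use
    var ret = db.counters.findAndModify({
        query: { _id: name },
        update: { $inc: { seq: 1 } },
        new: true,
        upsert: true
    });
    return ret.seq;
}

db.users.insert({ _id: getNextSequence("userid"), uid: "550e8400-e29b-41d4-a716-446655440000" });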

expert
  • You can if you want to. I disagree too, because that's a Mongo built-in feature: `.createIndex({ "number": 1 }, { unique: true })`, where the 1 represents increment and -1 otherwise – Tino Costa 'El Nino' Dec 25 '17 at 10:29
  • @TinoCosta'ElNino' What you are saying does not create an incremental field; it only creates an index on the field `number`, and that index is incremental and forces uniqueness. It doesn't in any way auto-increment the field, or even require it or have it by default. – Amr Saber Jul 21 '20 at 14:50
  • Actually, regarding the answer itself, I don't see how 12 bytes per document can cause serious scaling problems for a database/collection. Changing from a 12-byte `_id` to a 4-byte one (the BSON int limit): any collection that would have scaling problems from 12 bytes is probably going to overflow a 4-byte ID after some time. Also, the bytes you save are equivalent to 8 characters of user input (if the collection contains user input, which is almost always the case); not at all worth the effort and all the benefits you lose. – Amr Saber Jul 25 '20 at 10:49
33

You can, but you should not: https://web.archive.org/web/20151009224806/http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/

Each object in Mongo already has an ID, and they are sortable in insertion order. What is wrong with getting the collection of user objects, iterating over it, and using the iteration index as the incremental ID? Or go for a map-reduce job entirely.
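A rough shell sketch of that iteration idea (assuming a `users` collection with a `uid` field; note, as the comments below point out, that ObjectIds are only approximately insertion-ordered):

var position = 0;
var bitPositions = {}; // Unique User ID -> bit index
db.users.find().sort({ _id: 1 }).forEach(function (user) {
    bitPositions[user.uid] = position++;
});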

Konstantin Pribluda
  • The problem is concurrency. Iterating would emit duplicate incremental IDs. – est Dec 05 '11 at 10:57
  • You need this ID only for analysis, not data storage. IIUC, you need the sequential ID only as an index for your array so you can build the bitset. You can achieve that without storing this incremental ID in the database, and you can build your bit array without retrieving the data to the client – Konstantin Pribluda Dec 05 '11 at 11:49
  • It's quite unwise to get an incremental ID by iteration every time, especially when you are dealing with millions of users per datapoint. Doing an MAU would require something like 30x the iterations. – est Dec 05 '11 at 13:22
  • It's unwise to use incremental sequences when you have millions of users in the first place. However, millions of users doesn't exactly play well with bit arrays either, does it? I find it hard to tell what exactly you're trying to achieve. Concurrency will not be a problem using `findAndModify`. Also see http://www.mongodb.org/display/DOCS/Object+IDs and the HiLo algorithm: http://stackoverflow.com/questions/282099/whats-the-hi-lo-algorithm – mnemosyn Dec 05 '11 at 15:12
  • I just want to store some Redis bitmap data in Mongo http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/ for later queries. – est Dec 12 '11 at 03:23
  • It seems that you need a mapping between some UID and a bit in a Redis bitset. If that is the case, you can also store this mapping in Redis pretty effectively. And it would scale well, because the mapping is write-once and read-only – Konstantin Pribluda Dec 12 '11 at 06:49
  • Redis has RAM limitations and is not ideal for persisting old data. MongoDB handles that much better. I really need to generate unique auto-incremental IDs in MongoDB. – est Dec 12 '11 at 09:59
  • The thing is that I am too lazy to type more than a couple of digits for an ID, and Mongo has created huge IDs since the beginning. – Negarrak Sep 27 '18 at 21:18
  • As mentioned in the answer, Mongo ObjectIds are not strictly sortable in insertion order. https://stackoverflow.com/questions/31057827/is-mongodb-id-objectid-generated-in-an-ascending-order – ns15 Jun 05 '19 at 12:34
  • _"What is wrong...?"_ sometimes the data is coming to mongo from another system that uses a different ID value and type that needs to be preserved. – StingyJack Feb 12 '21 at 15:23
  • Sometimes you just want a human readable auto incrementing id, regardless of scaling. Nobody has the right to say I "should not" want to do that. – Phil Nov 12 '22 at 23:53
  • OP does not ask if they should do it; it depends on their situation. – yılmaz Jan 24 '23 at 20:32
19

I know this is an old question, but I shall post my answer for posterity...

It depends on the system that you are building and the particular business rules in place.

I am building a moderate-to-large-scale CRM in MongoDB, C# (backend API), and Angular (frontend web app), and found ObjectId utterly terrible for use in Angular routing for selecting particular entities. The same goes for API controller routing.

The suggestion above worked perfectly for my project.

db.contacts.insert({
    "id": db.contacts.find().count() + 1,
    "name": "John Doe",
    "emails": [
        "john@doe.com",
        "john.doe@business.com"
    ],
    "phone": "555111322",
    "status": "Active"
});

The reason it is perfect for my case, but not all cases, is that, as the above comment states, if you delete 3 records from the collection, you will get collisions.

My business rules state that, due to our in-house SLAs, we are not allowed to delete correspondence data or client records for longer than the potential lifespan of the application I'm writing. Therefore, I simply mark records with an enum "Status", which is either "Active" or "Deleted". You can delete something from the UI, and it will say "Contact has been deleted", but all the application has done is change the status of the contact to "Deleted"; when the app calls the repository for a list of contacts, I filter out deleted records before pushing the data to the client app.

Therefore, db.collection.find().count() + 1 is a perfect solution for me...

It won't work for everyone, but if you will not be deleting data, it works fine.
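For illustration, a minimal shell sketch of that soft-delete pattern, using the `status` field described above (`contactId` is whatever ID you look the contact up by):

// "Deleting" just flips the status flag; the document is never removed
db.contacts.update({ _id: contactId }, { $set: { "status": "Deleted" } });

// Listing contacts filters out deleted records before returning them
db.contacts.find({ "status": "Active" });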

Edit

In PyMongo:

db.contacts.count() + 1

(In PyMongo 4+, `count()` was removed; the equivalent is `db.contacts.count_documents({}) + 1`.)
Alex Nicholas
  • Is there any special reason that MongoDB says you have to use a function and a counters sequence, instead of your solution of db.xxx.find().count()+1? Does transaction processing maybe mess things up? Does your solution work well for CRUD in a web server environment? Thank you for your answer – ckinfos Sep 17 '17 at 07:37
  • This would not be good in a concurrent setup. You could easily get documents with the same _id if they did the count at the same time. – zephos2014 Jan 10 '18 at 21:47
  • Absolutely! In my instance, I am not having to deal with concurrency or sharding at all, so I don't have any problems using find().count()+1. As in my original answer, this won't work for everyone in every situation, but it has definitely worked in my particular scenario. The app has been in production now for nearly 12 months with no issues regarding my incrementing IDs. – Alex Nicholas Mar 20 '18 at 23:12
  • This solution is bad, because you change history! At one time there can be a document with ID 3, and at another time that can happen again, but there is no relation between the documents that have these IDs – ofir_aghai Oct 25 '19 at 06:26
  • It would be better to get the greatest ID instead of the count – Amin Shojaei Jun 29 '20 at 10:34
  • @ofir_aghai I can see why you would think that, but as I said, there is no method to delete data in the app I was building, so no, there won't be an instance where a document has the same ID as a previous document in the same collection. @AminShojaei Yes, in a scenario where there could be deletes, getting the greatest ID would be better. So, as my reply states, this was the right solution for me and the job I was doing: not suitable for EVERY solution, but perfect for mine. The app has been in production now for nearly 2 years without a problem. – Alex Nicholas Jul 27 '20 at 04:34
  • It's a very limited solution. Not being able to delete records seems too much of a sacrifice. – Sergei Oct 21 '21 at 19:42
  • As a software designer, you can't just say "I am not having to deal with concurrency". Your moderate-to-large CRM will fail; it's just a matter of time. As people above have rightly pointed out, if two users try to insert a document at the same time, you may get duplicate IDs. – Phil Jul 30 '22 at 20:10
  • Except here we are, nearly 5 years later, with no collisions. The app I built is not multi-user, so as I said, it won't work for everyone, but it was suitable for MY scenario. Because I will never have two users, I will never have 2 people entering the same data type at the same time, and therefore doing a count works in my scenario. Would I do it that way now? Probably not, but it's what I did at the time, it worked, and it's been working for 5 years, give or take. Now I would certainly consider a singleton ID creator that has access to the db and controls ID creation. – Alex Nicholas Jun 06 '23 at 04:47
  • @Sergei Yes, it is limited. However, in my requirements I didn't sacrifice the ability to delete records: deleting records was strictly forbidden by the product owner. – Alex Nicholas Jun 06 '23 at 04:48
1

The first record added should have

"_id" = 1

in your db:

$database = "demo";
$collections = "democollaction";
echo getnextid($database, $collections);

function getnextid($database, $collections) {

    $m = new MongoClient();
    $db = $m->selectDB($database);
    // select the collection (this assignment was missing in the original)
    $collection = $db->selectCollection($collections);
    // fetch the document with the highest _id and add 1 to it
    $cursor = $collection->find()->sort(array("_id" => -1))->limit(1);
    $array = iterator_to_array($cursor);

    foreach ($array as $value) {
        return $value["_id"] + 1;
    }
}
  • This will fail for empty collections. Also, this would take too much memory for big collections, because it fetches the whole collection and sorts it. It won't take too much processing because `_id` is indexed, but it will take a lot of memory nevertheless. – Amr Saber Jul 25 '20 at 10:57
1

I had a similar issue: I was interested in generating unique numbers which can be used as identifiers, but don't have to be. I came up with the following solution. First, initialize the collection:

fun create(mongo: MongoTemplate) {
    mongo.db.getCollection("sequence")
        .insertOne(Document(mapOf("_id" to "globalCounter", "sequenceValue" to 0L)))
}

And then a service that returns unique (and ascending) numbers:

@Service
class IdCounter(val mongoTemplate: MongoTemplate) {

    companion object {
        const val collection = "sequence"
    }

    private val idField = "_id"
    private val idValue = "globalCounter"
    private val sequence = "sequenceValue"

    fun nextValue(): Long {
        val filter = Document(mapOf(idField to idValue))
        val update = Document("\$inc", Document(mapOf(sequence to 1)))
        // findOneAndUpdate increments the counter atomically; by default it
        // returns the document as it was *before* the update, which is fine
        // here since the returned values are still unique and ascending
        val updated: Document = mongoTemplate.db.getCollection(collection).findOneAndUpdate(filter, update)!!
        return updated[sequence] as Long
    }
}

I believe this approach doesn't have the concurrency-related weaknesses that some of the other solutions may suffer from.

Marian
  • There will be a time between fetching the last ID and creating a new document; these 2 operations are not atomic. In concurrent operations, you have no guarantee that non-atomic operations will be executed before other threads execute other operations, so the following can happen for 2 threads A and B: A gets ID -> B gets ID -> B creates document -> A creates document. This will cause database key inconsistency. – Amr Saber Jul 25 '20 at 11:06
  • The solution is synchronized on the DB sequence using findOneAndUpdate, which is atomic. So if a thread switch happens after getting the ID, you get the following: 1) getting ID for doc A, idA=1; 2) getting ID for doc B, idB=2; 3) saving B {id:2}; 4) saving A {id:1}. It's not possible to introduce inconsistency. – Marian Jul 27 '20 at 06:55
  • You will have documents that were created later with lower IDs than documents created earlier. Not a duplication error, of course, but it can introduce problems when/if you depend on the order of the IDs (which is mostly why people use incremental IDs). That aside, I think this is one of the best solutions; it's just that the problem has no native support and thus no clean, totally working solution. – Amr Saber Jul 28 '20 at 10:40
  • Totally agree. I just didn't consider that an inconsistency. – Marian Jul 29 '20 at 10:02
1
// Seed the counter document once before first use:
// await collection.insertOne({ autoIncrementId: 1 });

// Atomically increment the counter; older Node drivers return
// { value: <document as it was before the update> } by default
const { value: { autoIncrementId } } = await collection.findOneAndUpdate(
  { autoIncrementId: { $exists: true } },
  {
    $inc: { autoIncrementId: 1 },
  },
);
return collection.insertOne({ id: autoIncrementId, ...data });
1

I used something like nested queries in MySQL to simulate auto-increment, which worked for me. To get the latest ID and add one to it, you can use:

lastContact = db.contacts.find().sort({ $natural: -1 }).limit(1)[0];
db.contacts.insert({
    "id": lastContact ? lastContact["id"] + 1 : 1,
    "name": "John Doe",
    "emails": ["john@doe.com", "john.doe@business.com"],
    "phone": "555111322",
    "status": "Active"
})

It solves the removal issue in Alex's answer, so no duplicate IDs will appear if any record is removed.

More explanation: I just get the ID of the latest inserted document, add one to it, and then set it as the ID of the new record. The ternary is for cases where we don't have any records yet or all of the records have been removed.

Amir Mehrnam
0

This could be another approach:

const mongoose = require("mongoose");

const contractSchema = mongoose.Schema(
  {
    account: {
      type: mongoose.Schema.Types.ObjectId,
      required: true,
    },
    idContract: {
      type: Number,
      default: 0,
    },
  },
  { timestamps: true }
);

contractSchema.pre("save", function (next) {
  var docs = this;
  // Count this account's existing contracts and use count + 1 as the new ID.
  // Look the model up by name only; passing the schema again would throw
  // an OverwriteModelError once the model is compiled below.
  mongoose
    .model("contract")
    .countDocuments({ account: docs.account }, function (error, counter) {
      if (error) return next(error);
      docs.idContract = counter + 1;
      next();
    });
});

module.exports = mongoose.model("contract", contractSchema);
Murat Akdeniz
0
// First check the collection length
// (.toArray() added: find() alone returns a cursor, not an array)
const data = await table.find().toArray()
if (data.length === 0) {
  const id = 1
  // then post your query along with your id
} else {
  // find the last item and then its id
  const lastItem = data[data.length - 1]
  const lastItemId = lastItem.id // or: const { id } = lastItem
  const id = lastItemId + 1
  // now apply the new id to your new item;
  // this works even if you delete an item from the middle
}
Anand