I have been searching for a long time and have read a lot of topics about this issue, but so far I couldn't find a solution. Moreover, not everything is clear to me, so hopefully you can help. Here's my question:

I built a Meteor application with a collection of orders in MongoDB. That collection is filled by reading a CSV file:

import_file_orders = function(file) {
  // Records end with '%\r\n'; fields within a record are separated by '|'.
  var lines = file.split('%\r\n');
  // Skip the last element, i.e. whatever follows the final record terminator.
  var l = lines.length - 1;
  for (var i = 0; i < l; i++) {
    var line = lines[i];
    var line_parts = line.split('|');
    var ex_key = line_parts[0];
    var ex_name = line_parts[1];
    var clin_info = line_parts[2];
    var order_info = line_parts[3];
    var clinician_last_name = line_parts[4];
    var clinician_first_name = line_parts[5];
    var clinician_code = line_parts[6];
    var clinician_riziv = line_parts[7];
    var pat_id = line_parts[8];
    Meteor.orders.insert({Patient:pat_id, Exam_code:ex_key, Exam_name:ex_name, Clinical_info:clin_info, Order_info:order_info, Clinician:{first:clinician_first_name, last:clinician_last_name, c_code:clinician_code, riziv:clinician_riziv}, Planned:null});
    console.log("%");
  }
};

After reading the CSV file there are errors for some documents in the collection:

duplicate key error index: protocolplanner.Orders.$_id_ dup key: { :     "2ZGvRfuD8iMvRiXJd" } insert failed

When I run the Mongo command db.Orders.getIndexes() I see that there are two indexes:

{
  "v" : 1,
  "key" : {
    "_id" : 1
  },
  "name" : "_id_",
  "ns" : "protocolplanner.Orders"
}

It seems that there are two indexes: one _id index (which is always there and can't be deleted) and one _id_ index. The _id_ index seems to cause the error. So I have three questions:

First: Why is there an _id_ index? I defined no indexes in my Meteor code. Second: Why is there a dup key error for that index? Third: It also seems that I can't remove the _id_ index. Why is that? I know that you can't remove an _id index, but in my opinion this isn't an _id index.

As you can see I'm totally lost. Please help!

EDIT:

As commented below, a little bit more info:

The file I'm reading has 10151 lines. The function that reads the file is defined client side. Via allow and deny rules, only admin users can insert the data into Mongo. The lines are read correctly, and after reading the file all data is available in the app. After a few seconds the index is automatically created by Mongo and the errors appear. From then on, the lines for which the error occurs are no longer visible in the app.

I tried the following in the Mongo shell: db.Orders.find({_id:"2ZGvRfuD8iMvRiXJd"})

Mongo gives me the right document. This proves that the _id indeed is created by Meteor when the data is inserted in the DB. However this _id should be unique so I'm totally confused about the error I have.

EDIT 2: After some trial and error I have some new information about this problem. It may help in finding an answer.

As described above, when I read the data on the client side, I get the duplicate key error, even when I use ObjectID instead of the Meteor ID. However, when I push the data directly into Mongo via the mongoimport command, all the data is imported correctly and no error occurs. There seems to be a conflict between server and client when I insert this amount of data (maybe asynchronous timing issues).

At the moment I'm looking for a way to read the data server side, in the hope that no error occurs.

Quantum
  • Well please don't ask "three questions" in this space since the format is "one question" to be reconciled to an accepted answer. With respect to 1 and 3 though, the `_id` field in MongoDB is a "primary key". It is considered the "ultimate unique identifier" of a document in itself. MongoDB itself has its own concept of how this "unique" value is generated. Meteor chooses to "replace" this with its own definition. If your code is truly as shown then the "intent" is that the "primary key" be unique for each document created. – Neil Lunn Mar 10 '15 at 10:36
  • OK, I'm sorry for the three questions. However, since the primary key is unique I don't understand why there is a duplicate key error. – Quantum Mar 10 '15 at 10:44
  • You and me both if this is client side code and not being called in an asynchronous context. The main intent here is for a "unique value" given that constraint for a primary key. You could perhaps edit your question to include more information of whether your operation is client or server ( guessing server ). Also the amount of data you are processing. Anything generating the same primary key value is worrying. But if you could add more information to explain this then it would be helpful. – Neil Lunn Mar 10 '15 at 10:52
  • The file that I'm reading has 10151 lines. Right after they are read, every line of data is available in the app. After a few seconds the error appears in the console (after generating the index). The lines for which there are errors aren't visible in the app after that. The function that reads the file is defined client side. When I do a search in the Mongo console with the _id field in the query (e.g.: db.Orders.find({_id:"2ZGvRfuD8iMvRiXJd"})) then I find the right data. This proves that the _id indeed is created by Meteor during inserting the data. Normally it should be unique – Quantum Mar 10 '15 at 13:29
  • Agreed on the principle that it should be unique. The information would be better placed within the question than in a comment. There is an edit link on your question. Use it to add your additional details. – Neil Lunn Mar 10 '15 at 13:31

2 Answers

This won't solve your problem, but it should point you in the right direction and maybe enable you to isolate the problem that you can use to create a new question:

First: Why is there an _id_ index?

There isn't. There's only one index, and it has a name and a key descriptor. That's not the same thing. The name of the default index is _id_, its key is _id.
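
To make the name/key distinction concrete, here is a small sketch that treats the `getIndexes()` output from the question as a plain JavaScript array and lists each index's name next to the fields it covers. The helper `describeIndexes` is made up for this illustration and is not part of MongoDB or Meteor.

```javascript
// The getIndexes() output from the question, written as a plain array.
var indexes = [
  { v: 1, key: { _id: 1 }, name: "_id_", ns: "protocolplanner.Orders" }
];

// Hypothetical helper: pair each index name with the fields in its key.
function describeIndexes(list) {
  return list.map(function (ix) {
    return { name: ix.name, fields: Object.keys(ix.key) };
  });
}

var summary = describeIndexes(indexes);
console.log(summary.length);    // number of indexes
console.log(summary[0].name);   // the index name
console.log(summary[0].fields); // the field(s) the index is built on
```

There is a single entry: its name is `_id_` and the field it indexes is `_id`.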

Why is there a dup key error for that index?

The _id is normally created client-side, not server-side. The question is where those keys come from, because 2ZGvRfuD8iMvRiXJd is certainly not an ObjectId. This might be a Meteor key, or you might be using some custom primary key, but I don't know how these keys are generated. Maybe whatever generates the key is susceptible to collisions?

More info on that would be helpful, but I'd suggest phrasing a new question so the question doesn't grow too large or get a lot of history.

Third: Also it seems that I can't remove the _id_ index. Why is it?

That's a corollary of the first answer: You can't delete the mandatory primary key index.

Edit:

Meteor, by default, generates ids in a different way than MongoDB. That makes sense, because the convention for ObjectId makes collisions probable if the number of clients is large (i.e. if the clients aren't server instances, but client browsers, of which there are probably 2-3 orders of magnitude more).

Instead, Meteor apparently uses a method to consistently generate pseudo-random numbers on client and server. Irritatingly, the implementation uses a PRNG and falls back to a deterministic random number generator (Alea) that is not cryptographically strong. In other words, finding out how exactly your ids are being generated could be tricky because it depends on a lot of details of your environment.
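
To illustrate why a deterministic generator matters, here is a rough sketch (not Meteor's actual implementation). Everything in it is an assumption made for illustration: the helper names `seededGenerator` and `makeId`, the simple linear congruential generator standing in for Alea, the alphabet, and the 17-character length taken from the id in the error message. It only shows that two generators started from the same seed produce the same id; it is not a claim about what went wrong in your app.

```javascript
// Illustrative alphabet (an assumption, not taken from Meteor's source).
var ALPHABET = "23456789ABCDEFGHJKLMNPQRSTWXYZabcdefghijkmnopqrstuvwxyz";

// Hypothetical stand-in for a deterministic PRNG: a 32-bit linear
// congruential generator returning fractions in [0, 1).
function seededGenerator(seed) {
  var state = seed >>> 0;
  return function () {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Build a 17-character id from whatever fraction source is passed in.
function makeId(nextFraction) {
  var id = "";
  for (var i = 0; i < 17; i++) {
    id += ALPHABET.charAt(Math.floor(nextFraction() * ALPHABET.length));
  }
  return id;
}

// Two independent generators with the same seed yield the same id.
var idA = makeId(seededGenerator(42));
var idB = makeId(seededGenerator(42));
console.log(idA === idB);
```

With a non-deterministic source (for example `makeId(Math.random)`) the two ids would be expected to differ, which is why the seeding details of the environment matter.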

Workaround: Try to use ObjectId as a primary key:

Orders = new Meteor.Collection('Orders', {idGeneration: 'MONGO'});
mnemosyn
  • I'm sorry. How does this answer the essential question that was addressed in dilution of "Why is the primary key the same for multiple records". Sure the content expands on the "comment" I gave myself (alluded to in this "answer"). But does it answer that core point? I don't see how it does. It's really just a very long comment. – Neil Lunn Mar 10 '15 at 11:40
  • The OP has trouble isolating the problem, and this answer answers two of the four questions he asked in total, plus giving advice on how to isolate the root cause to create a new question that is more to the point. It's not always about the question-answer pattern, it's also about helping people. I'd prefer if the OP *simplified* the question and posted a new one, instead of *expanding* the current one, achieving a better fit for the Q&A structure. – mnemosyn Mar 10 '15 at 11:45
  • I think I said that and I think I said that you basically mirrored what I commented for in clarification with a longer version of that. This does not "answer" the core question of "why does the same `_id` value get inserted". If you don't know the answer then just leave a comment. Preferably not just duplicating what someone else has said. – Neil Lunn Mar 10 '15 at 11:47
  • As you can see, I wrote that answer parallel to your second comment. You didn't even *attempt* to explain that the index *key* and the index *name* aren't the same thing, so I hardly just copied your comment. Focus your energy on helping people instead. – mnemosyn Mar 10 '15 at 11:51
  • Thanks for the answer. I have added a comment above with a little bit more info. However at this moment I have no idea about what is going wrong. – Quantum Mar 10 '15 at 13:31

So far it's not clear what causes the duplicate key error. However, I tried some things and found a workable solution.

I moved the insert of the data from the client to the server side. To do that, I followed the solution in this topic:

How to import data from CSV file into Meteor collection at server side

When the insert function is server side the duplicate key error doesn't appear and everything works perfectly.
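
For readers who want a starting point, here is a minimal sketch of what such a server-side import could look like. It is not the code from the linked topic. The names `parseOrders` and `importOrders` are made up for this example, and it assumes a global collection variable named `Orders`. Unlike the original function, it drops empty lines instead of skipping the last element.

```javascript
// Pure parsing step: turn the file contents into order documents.
// Records end with '%\r\n'; fields are separated by '|'.
function parseOrders(fileContents) {
  return fileContents
    .split('%\r\n')
    .filter(function (line) { return line.length > 0; })
    .map(function (line) {
      var p = line.split('|');
      return {
        Patient: p[8],
        Exam_code: p[0],
        Exam_name: p[1],
        Clinical_info: p[2],
        Order_info: p[3],
        Clinician: { first: p[5], last: p[4], c_code: p[6], riziv: p[7] },
        Planned: null
      };
    });
}

// Server-only part: the guard keeps this block from running outside Meteor.
if (typeof Meteor !== 'undefined' && Meteor.isServer) {
  Meteor.methods({
    // Hypothetical method name; receives the file contents as a string.
    importOrders: function (fileContents) {
      var docs = parseOrders(fileContents);
      // Assumption: `Orders` is the collection defined elsewhere in the app.
      docs.forEach(function (doc) { Orders.insert(doc); });
      return docs.length;
    }
  });
}
```

On the client, the file contents would then be sent with something like `Meteor.call('importOrders', fileContents, callback)`. Note that allow and deny rules do not apply to inserts made inside a server method, so the admin check would have to be done in the method itself.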

Quantum