0

I am new to MongoDB. I have written a JS query that I am running in the mongo shell.

Basically, I have two tables. Both have almost 160,000 records.

I am iteating over first table and for every record, going to second table to find if corresponding record exists there.

pbp = db.poss_pbp.find().batchSize(1000)


while(pbp.hasNext()){

  pbp_temp = pbp.next();
  id = (pbp_temp["poss_idx"]);

  opt_temp = db.poss_opt.find({"poss_idx": id}).count()

  if(opt_temp == 0)
  {
    //Do something
  }
}

The query is running extremely slow (around 4-5 minutesfor every 1000 records). What can I do to make it work faster? Key "poss_idx" has an index in database.

mihsathe
  • 8,904
  • 12
  • 38
  • 54
  • 2
    You could be causing (the equivalence of) a scan. If you only care about existence, you might want to check http://stackoverflow.com/questions/8389811/how-to-query-mongodb-to-test-if-an-item-exists out. They seem to suggest `Count` is a rather slow operation. – Mitch Jun 11 '13 at 22:13
  • 2
    If it's "only" 160,000 records, you might want to just pull back all the `poss_idx` identifiers at once (just grab the IDs) and then compare them locally in memory. – WiredPrairie Jun 12 '13 at 00:43

1 Answers1

1

I believe that there is a problem with the index. I have two similar tables: 200,000 records and about 500,000. A similar request is performed for about 40 seconds with an index and a very long time without an index.

Run the query:

 db.poss_opt.find({poss_idx: "some_id"}).explain()

If the above query could not use an index, you will see:

 {
     "cursor": "BasicCursor",
     "nscannedObjects": 532543,
     "nscanned": 532543,
     "millis": 712,
     "indexBounds": {},
 }

Otherwise:

 {
     "cursor": "BtreeCursor poss_idx_1",
     "nscannedObjects": 0,
     "nscanned": 0,
     "millis": 0,
     "indexBounds": {"poss_idx": [["some_id", "some_id"]]},
 }

To view index information for the collection, use db.poss_opt.stats() and db.poss_opt.getIndexes()

If the problem is with the index, try to drop and create new one:

 db.poss_opt.dropIndex({poss_idx: 1})
 db.poss_opt.ensureIndex({poss_idx: 1})

If you have any questions, please feel free to ask.

Jarandinor
  • 1,846
  • 15
  • 8