I have an array of item IDs, and I would like to find the first ones that match documents in my MongoDB database.

The array looks like this, but could be very long: ["item:9802", "item:15051", "item:10028", "item:6575", "item:2355"]. What I need is the first three hits, in array order, that can be found in the database. Not all IDs will exist.

I use the Ruby driver from github.com/mongodb/mongo-ruby-driver. This is what I did:

search_hash = {:id=>{"$in"=>["item:9802", "item:15051", "item:10028", "item:6575", "item:2355"]}}
found = coll.find(search_hash).to_a

But I find that this returns the exact same results (and in the same order) when I use a completely different order, like:

search_hash = {:id=>{"$in"=>["item:2355", "item:6575", "item:9802", "item:15051", "item:10028"]}}

Is there any way to get the first matches without having to loop over the array (which is very long) and perform a find on every iteration? I'm pretty new to Mongo, so maybe this is a very simple question, but I hope someone can help me with this.

Fritzz
  • You should use `coll.find(search_hash).limit(3)`. – Arup Rakshit Jan 07 '15 at 17:16
  • Thanks for your answer, but the problem is that the ordering is not correct. In both cases, it returns ["item:10028", "item:15051", "item:9802"] in that order, whereas I only want the first hits in my array. limit(1) for example should give ["item:9802"] in the first case, and ["item:2355"] in the second. – Fritzz Jan 07 '15 at 17:20
  • what should be your ordering output? – Arup Rakshit Jan 07 '15 at 17:21
  • I want the hits in order of the array. Preferably without looping over the array as it can become quite big, and I only need the first, or first 3. – Fritzz Jan 07 '15 at 17:23
  • Just marked your basic duplicate. That's how you preserve the order that you provided in your `$in` clause. Combine this with [**`$limit`**](http://docs.mongodb.org/manual/reference/operator/aggregation/limit/) in aggregation as part of what was previously mentioned and you have a complete answer. – Neil Lunn Jan 07 '15 at 17:25
  • @NeilLunn, what do you mean by 'Just marked your basic duplicate.' Preserving the order is exactly what I need. How do I do that? – Fritzz Jan 07 '15 at 22:24
  • @Fritzz At that time, your question was marked as a duplicate of [Does MongoDB's $in clause guarantee order?](http://stackoverflow.com/a/22800784/2313887) But appears to have been voted for reopening. Noting that the link has been copied into the answer given but this has nothing to do with that answer. To have the server return the results in order and limit to the first three matches in that list order you would use the "aggregation" approach there and add a `{ "$limit" => 3 }` pipeline stage at the end. – Neil Lunn Jan 07 '15 at 23:21

2 Answers

Your query should be:

search_hash = {id: {$in: ["item:9802", "item:15051", "item:10028", "item:6575", "item:2355"]}}
db.coll.find(search_hash).limit(3).sort({id: 1})

Read "Combine Cursor Methods" and "Does MongoDB's $in clause guarantee order?".
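For reference, the aggregation approach from the linked `$in` question, with a `$limit` stage added as suggested in the comments, might be sketched in Ruby roughly as follows. This is only a sketch: the helper name `ordered_in_pipeline` and the `weight` field are made up for illustration, and it assumes the IDs are stored in a field called `id`, as in the question. It only builds the pipeline; running it would be something like `coll.aggregate(pipeline)`.

```ruby
# Hypothetical helper: builds an aggregation pipeline that matches the given
# ids, assigns each document a weight equal to the position of its id in the
# list, sorts by that weight and keeps only the first `limit` documents.
def ordered_in_pipeline(ids, limit, field = "id")
  # Build nested $cond expressions from the last id to the first:
  # if field == ids[0] then 0, else if field == ids[1] then 1, ...
  # The innermost fallback (ids.length) is never reached after $match.
  weight = ids.each_with_index.to_a.reverse.inject(ids.length) do |else_branch, (id, index)|
    { "$cond" => [{ "$eq" => ["$#{field}", id] }, index, else_branch] }
  end

  [
    { "$match"   => { field => { "$in" => ids } } },
    { "$project" => { field => 1, "weight" => weight } },
    { "$sort"    => { "weight" => 1 } },
    { "$limit"   => limit }
  ]
end

pipeline = ordered_in_pipeline(["item:9802", "item:15051", "item:10028"], 3)
# Assumption: with a connected collection, this would be run as
#   coll.aggregate(pipeline)
```

Note that the `$cond` expression gets one level deeper for every ID, so this is probably only practical for a moderately sized list, not a very long one.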

Arup Rakshit
  • This returns all matching documents, in order of id, and only the first 3. – Fritzz Jan 07 '15 at 17:26
  • I want the hits in order of the array, in the order of the array as I supply it. Preferably without looping over the array as it can become quite big, and I only need the first, or first 3. – Fritzz Jan 07 '15 at 17:28
  • Your `in` query gives result as per the DB record.. then order it. – Arup Rakshit Jan 07 '15 at 17:29

Thanks for your answers and suggestions. The "aggregation" approach from "Does MongoDB's $in clause guarantee order?" seems to be the best way to do this in Mongo, but I find it pretty cumbersome. I just want to make sure I get a result back, so I'm sending more IDs than I need, to make sure I find at least something. I don't want to be bothered with assigning weights to everything. I've decided to just loop over the items and break as soon as I find something, like this:

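# Build up to 30 unique candidate IDs, in priority order.
# (`item` is not defined in this snippet; presumably it holds the ID prefix.)
item_ids = params['item_ids'].split('&').uniq.first(30).map {|key| "#{item}:#{key}"}
found = []
# Query one ID at a time and stop at the first hit.
item_ids.each do |item_id|
  found = coll.find({id: item_id}).to_a
  break if found.any?
end
if found.any?
  ## The rest of my logic, using the first hit from my array.
end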

For me, this is easiest to read, maintain and understand. If anyone has a better approach which involves only one Mongo query, I'm still very interested.
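One possible single-query alternative, given that the list is capped at 30 IDs here: fetch all matches with one `$in` query and put them back into array order on the client. The sketch below shows only the reordering step in plain Ruby; `first_hits_in_order` is a made-up helper name, and the `docs` array is a stand-in for what `coll.find("id" => { "$in" => ids }).to_a` might return.

```ruby
# Hypothetical helper: returns the first `n` documents, ordered by the
# position of their "id" value in `ids` (not by the order the server
# happened to return them in).
def first_hits_in_order(docs, ids, n = 3)
  # Map each id to its first position in the list.
  position = {}
  ids.each_with_index { |id, index| position[id] ||= index }

  docs.select { |doc| position.key?(doc["id"]) }
      .sort_by { |doc| position[doc["id"]] }
      .first(n)
end

ids = ["item:9802", "item:15051", "item:10028", "item:6575", "item:2355"]

# Stand-in data, in the order the question says the server returned it.
docs = [
  { "id" => "item:10028" },
  { "id" => "item:15051" },
  { "id" => "item:9802" }
]

p first_hits_in_order(docs, ids, 1).map { |doc| doc["id"] }
# => ["item:9802"]
```

This trades the per-ID round trips for one query that may return more documents than needed, which seems reasonable for 30 IDs but less so for a very long list.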

Fritzz