
I'm new to MongoDB and this is my first use of MapReduce ever.

I have two collections, Shops and Products, with the following schema:

Products
{'_id', 'type': 'first', 'enabled': 1, 'shop': $SHOP_ID  }
{'_id', 'type': 'second', 'enabled': 0, 'shop': $SHOP_ID  }
{'_id', 'type': 'second', 'enabled': 1, 'shop': $SHOP_ID  }

And

Shops
{'_id', 'name':'L', ... }
{'_id', 'name':'M', ... }

I'm looking for a MapReduce equivalent of a SQL GROUP BY statement in MongoDB that retrieves the Shops with name 'L' that have at least one Product with 'enabled': 1.

How can I do it? Thank you.

Sergio Tulentsev
Alexandru R
  • Do you need to join those collections together frequently? If so, you might want to reconsider using MongoDb or rethink your schema a bit. – Tim Gautier Apr 18 '12 at 20:55
  • Hi there, did you have any luck with this? The answer from Marc looks quite good, and I think deserves feedback of some kind. – halfer Oct 11 '15 at 21:56
  • actually we moved away to a sql friendly database. Mongo is good for other operations – Alexandru R Oct 12 '15 at 15:14
  • OK. Please consider upvoting each of the below answers, in that case - you received kind assistance from two people, and did not find the time to respond to either. – halfer Oct 24 '15 at 16:16

2 Answers


It should be possible to retrieve the desired information without a Map Reduce operation.

You could first query the "Products" collection for documents that match {'enabled': 1}, and then take the list of $SHOP_IDs from that query (which I imagine correspond to the _id values in the "Shops" collection), put them in an array, and perform an $in query on the "Shops" collection, combined with the query on "name".

For example, given the two collections:

> db.products.find()
{ "_id" : 1, "type" : "first", "enabled" : 1, "shop" : 3 }
{ "_id" : 2, "type" : "second", "enabled" : 0, "shop" : 4 }
{ "_id" : 3, "type" : "second", "enabled" : 1, "shop" : 5 }
> db.shops.find()
{ "_id" : 3, "name" : "L" }
{ "_id" : 4, "name" : "L" }
{ "_id" : 5, "name" : "M" }
> 

First, find all of the product documents that match {"enabled" : 1}:

> db.products.find({"enabled" : 1})
{ "_id" : 1, "type" : "first", "enabled" : 1, "shop" : 3 }
{ "_id" : 3, "type" : "second", "enabled" : 1, "shop" : 5 }

From the above query, collect the "shop" values (which correspond to _id values in the shops collection) into an array:

> var c = db.products.find({"enabled" : 1})
> shop_ids = []
[ ]
> c.forEach(function(doc){shop_ids.push(doc.shop)})
> shop_ids
[ 3, 5 ]

Finally, query the shops collection for documents with _id values in the shop_ids array that also match {name:"L"}.

> db.shops.find({_id:{$in:shop_ids}, name:"L"})
{ "_id" : 3, "name" : "L" }
> 
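The same two steps can also be sketched as a single function. This is only an illustration in plain JavaScript: the arrays stand in for the two collections using the sample data above, and the helper name shopsWithEnabledProducts is hypothetical, not part of any MongoDB API. In the shell itself, db.products.distinct("shop", {"enabled" : 1}) should return the de-duplicated list of shop ids directly, which avoids the manual forEach loop.

```javascript
// In-memory stand-ins for the two collections (same sample data as above).
const products = [
  { _id: 1, type: "first", enabled: 1, shop: 3 },
  { _id: 2, type: "second", enabled: 0, shop: 4 },
  { _id: 3, type: "second", enabled: 1, shop: 5 },
];
const shops = [
  { _id: 3, name: "L" },
  { _id: 4, name: "L" },
  { _id: 5, name: "M" },
];

// Hypothetical helper: shops with the given name that own
// at least one enabled product.
function shopsWithEnabledProducts(products, shops, name) {
  // Step 1: collect the distinct shop ids of all enabled products.
  const shopIds = new Set(
    products.filter((p) => p.enabled === 1).map((p) => p.shop)
  );
  // Step 2: the in-memory equivalent of {_id: {$in: shop_ids}, name: name}.
  return shops.filter((s) => shopIds.has(s._id) && s.name === name);
}

const result = shopsWithEnabledProducts(products, shops, "L");
console.log(result); // one shop: _id 3, name "L"
```

Using a Set removes duplicate shop ids when a shop has several enabled products, which keeps the list passed to $in small.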

Similar questions regarding doing the equivalent of a join operation with Mongo have been asked before. This question provides some links which may provide you with additional guidance:
How to join MongoDB collections in Python?

If you would like to experiment with Map Reduce, here is a link to a blog post from a user who used an incremental Map Reduce operation to combine values from two collections.
http://tebros.com/2011/07/using-mongodb-mapreduce-to-join-2-collections/

Hopefully the above will allow you to retrieve the desired information from your collections.

Marc

Short answer: you can't do that (with a single MapReduce command).

Long answer: MapReduce jobs in MongoDB run on a single collection only and cannot refer to other collections in the process. So the JOIN/GROUP BY-like behaviour of SQL is not available here. The new Aggregation Framework also operates on a single collection only.

I propose a two-part solution:

  1. Get all shops with name "L".

  2. Compose and run a map-reduce command that checks every product document against this pre-computed list of shops.
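
The two parts above could look roughly like the sketch below. It is plain JavaScript run against in-memory arrays, with sample data assumed for illustration; runMapReduce is a hypothetical stand-in for the server-side command, not MongoDB's API. In a real mapReduce call the map function reads the document from `this`, calls a global emit, and would typically receive the pre-computed list through the `scope` option.

```javascript
// Sample data assumed for illustration.
const shops = [
  { _id: 3, name: "L" },
  { _id: 4, name: "L" },
  { _id: 5, name: "M" },
];
const products = [
  { _id: 1, type: "first", enabled: 1, shop: 3 },
  { _id: 2, type: "second", enabled: 0, shop: 4 },
  { _id: 3, type: "second", enabled: 1, shop: 5 },
];

// Part 1: pre-compute the ids of all shops named "L".
const shopIds = shops.filter((s) => s.name === "L").map((s) => s._id);

// Part 2: map emits a count of 1 for every enabled product
// that belongs to one of the pre-computed shops.
function map(doc, emit) {
  if (doc.enabled === 1 && shopIds.includes(doc.shop)) {
    emit(doc.shop, 1);
  }
}

// reduce sums the counts emitted for one shop id.
function reduce(key, values) {
  return values.reduce((a, b) => a + b, 0);
}

// Hypothetical in-memory stand-in for the map-reduce command:
// group emitted values by key, then reduce each group.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  docs.forEach((doc) =>
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    })
  );
  return Array.from(groups, ([key, values]) => ({
    _id: key,
    value: reduce(key, values),
  }));
}

const counts = runMapReduce(products, map, reduce);
console.log(counts); // shop 3 has one enabled product
```

Each output document pairs a shop _id with its number of enabled products, so any shop that appears in the output satisfies both conditions.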

Sergio Tulentsev