2

OK, so, I have two collections. One of them has 6M documents, the other one has 25M documents. I want them in a new collection:

Collection 1 example:

Movie

 {"movieId" : "1", "name" : "Titanic"},
 {"movieId" : "2", "name" : "King Kong"}

Collection 2 example:

Character

{"characterId": "1", "movieId": "1", "characterName": "Name 1"},
{"characterId": "2", "movieId": "1", "characterName": "Name 2"},
{"characterId": "3", "movieId": "1", "characterName": "Name 3"}

So I want a new collection like:

{
  "movieId" : "1", 
  "name" : "Titanic", 
  "characters":[ *collection 2 here* ]
},
{
  "movieId" : "2", 
  "name" : "King Kong", 
  "characters":[]
}

I tried:

db.Movie.aggregate([{ $lookup: { from: "Character",localField: "movieId", foreignField: "movieId", as: "characters" }},{ $out : "movie_characters" }])

But it never ends :( (by never I mean like 10 hours later it was still thinking) If I execute it without the $out it show results in 10 minutes maybe.

Am I doing something wrong with $out?

Thanks for any advice.

andrea
  • 23
  • 3

2 Answers2

2

The major thing you leaving here is an index.

$lookup when matching the foreign field, MongoDB uses the index.

So, create an index in the Character collection using:

db.Character.createIndex({ "movieId": 1 })

And then apply the lookup aggregation

db.Movie.aggregate([
  {
    $lookup: {
      from: "Character",
      localField: "movieId", 
      foreignField: "movieId", 
      as: "characters" 
    }
  },
  { 
    $out : "movie_characters" 
  }
])

Hope this will help :)

Vijay Rajpurohit
  • 1,266
  • 2
  • 13
  • 25
1

The problem not just in a single index but in performance also. I highly recommend you to use Cursors for heavy queries like yours. (Check this answer for more info)

When you are using this query you also "flooding" your RAM with all the documents. So using cursor will definitely helps you.

Also, try to use Mongo Compass for testing your queries before launch it on production via Explain mode, which gives your all what youneed to know, especially how long your query takes to execute, what indexes it using, and much more.

AlexZeDim
  • 3,520
  • 2
  • 28
  • 64
  • 1
    Thanks, Vijay Rajpurohit solved it, but I'm reading the doc you send for futures references. Have a nice day :) and thanks for your time – andrea May 01 '20 at 15:37