4

I have two huge (few hundred thousands of records) collections Col1 and Col2, and I need to fetch joined data from both of them. There is a join criteria that lets me dramatically decrease number of records returned to few hundreds, so in SQL language I would run something like

 SELECT ... FROM Col1 INNER JOIN Col2 ON Col1.field1 = Col2.field2

and it would run pretty fast as Col1.field1 and Col2.field2 are indexed fields. Is there any direct way or workaround to do the same thing fast in MongoDb with indexes usage, not to scan all the items?

Note: I cannot redesign collections to merge them into one.

John Saunders
  • 160,644
  • 26
  • 247
  • 397
YMC
  • 4,925
  • 7
  • 53
  • 83
  • we have a similar situation. Did you find a way to improve join performance between 2 x collections that have an indexed foreign key? – Ka Tech Sep 13 '21 at 10:05

3 Answers3

5

MongoDB has no JOIN so there is not a fast equivalent. It is most likely a schema design issue but you said you can't change that. You can't query multiple collections in one query.

You can either do the join client-side in 2 queries or you can do it in non-live style by doing a map-reduce and generating a 3rd collection.

Reference this other question for details on how to do a map-reduce

Community
  • 1
  • 1
jdi
  • 90,542
  • 19
  • 167
  • 203
  • Both ways you mentioned suppose iteration through all half a million records of at least one of collection which is performance killer, and I need to do it in run-time, not somewhere in background. I do not think it's a schema design issue, it's rather issue of selected database type, MongoDb seems do not fit well our application requirements – YMC Sep 24 '12 at 04:36
  • Thats is what I meant. Its a schema design issue for the database selected. Ideally you would have stored this under a single collection. Its not an efficient situation for this DB – jdi Sep 24 '12 at 04:48
  • The thing is that storing this information into one collection might increase database size hundreds times as there are hundreds ways to join this 2 collections, keeping them separate means data denormalization, merging them together means excessive size of database and many repeated data. Even if I can redesign schema I would not do it. So my conclusion is that MongoDb is just not flexible enough as relational databases are. Thanks anyway for the response – YMC Sep 24 '12 at 17:03
  • @YMC why on earth would you need to run a real time select of half a million rows? What user would be able to display all of that data? I think you need to rethink what you are doing. – Snowburnt Nov 04 '13 at 23:07
3

In order to join in MongoDb 4.2 you can use aggregation and $lookup like this query:

db.collection.aggregate([
   { $lookup: { from: "...", ... } }
])

It is usefull for me

More information: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/

Yadi
  • 31
  • 4
0

the join in MongoDB is so expensive. 2 solutions:

  • Redesign merge them into one
  • limit, match before you join
0xcuonghx
  • 450
  • 5
  • 4