2

Suppose I have a collection of MongoDB documents with the following structure:

{
    id_str:     "some_value",
    text:       "some_text",
    some_field: "some_other_value"
}

I would like to filter such documents so as to obtain the ones with distinct text values.

I learned from the MongoDB documentation how to extract unique field values from a collection, using the distinct operation. Thus, by performing the following query:

db.myCollection.distinct("text")

I would obtain an array containing the distinct text values:

["first_distinct_text", "second_distinct_text",...]

However, this is not the result that I would like to obtain. Instead, I would like to have the following:

{ "id_str": "a_sample_of_id_having_first_distinct_text", 
  "text": "first_distinct_text"}
{ "id_str": "a_sample_of_id_having_second_distinct_text", 
  "text": "second_distinct_text"}

I am not sure if this can be done with a single query.

I found a similar question, but it does not fully solve my problem.

Do you have any hint on how to solve this problem?

Thanks.

Eleanore
  • Do you mean documents where `id_str != text != some_field`? – styvane Aug 05 '15 at 08:49
  • No. `some_field` would be ignored. `text` has to be unique. `id_str` can be picked at random from the set of documents sharing the same `text` value. So if I have `{id_str:123,text:"hello"}` and `{id_str:456,text:"hello"}`, the result could be either `{id_str:123,text:"hello"}` or `{id_str:456,text:"hello"}`, since they have the same text and it does not matter whether "123" or "456" is selected as `id_str`. – Eleanore Aug 05 '15 at 08:50

2 Answers

2

You should look into an aggregation query that uses the $group stage, probably together with the $first accumulator.

Maybe something along the lines of:

db.myCollection.aggregate([{ $group : { _id : { text: "$text" },
                                        // keep the id_str of the first document seen in each group
                                        id_str: { $first: "$id_str" }
                                      }
                           }])
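
To make the effect of grouping on text and keeping the first id_str concrete, here is a minimal sketch in plain JavaScript that emulates the idea without a database. The sample documents and the function name firstPerText are hypothetical, chosen only for illustration. It is not the server's implementation: MongoDB does not guarantee the output order of $group, and without a preceding $sort the document that $first picks depends on the order in which documents reach the stage.

```javascript
// Hypothetical sample documents, loosely modelled on the example in the comments.
const docs = [
  { id_str: "123", text: "hello", some_field: "a" },
  { id_str: "456", text: "hello", some_field: "b" },
  { id_str: "789", text: "world", some_field: "c" },
];

// Emulates grouping on text and keeping the first id_str of each group,
// then reshaping every group to { id_str, text }.
function firstPerText(documents) {
  const seen = new Map(); // text -> id_str of the first document with that text
  for (const doc of documents) {
    if (!seen.has(doc.text)) {
      seen.set(doc.text, doc.id_str);
    }
  }
  return Array.from(seen, ([text, id_str]) => ({ id_str, text }));
}

console.log(firstPerText(docs));
```

Each distinct text value appears once in the result, paired with one of the id_str values that carried it, which is the shape asked for in the question.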
Etienne Membrives
0

Try:

 db.myCollection.aggregate({$group: {_id: {'text': "$text", 'id_str': '$id_str'}}})

More information here: http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/

viktor.svirskyy
  • I tried. It still gives duplicates on the `text` field, since the `id_str` field contains non-duplicate values – Eleanore Aug 05 '15 at 08:47
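
The duplicates reported in this comment follow from the grouping key: when the key is the pair (text, id_str), every distinct pair forms its own group. A minimal plain-JavaScript sketch of that effect, with hypothetical sample data and a hypothetical helper named countGroups:

```javascript
// Hypothetical sample: two documents sharing the same text but not the same id_str.
const sample = [
  { id_str: "123", text: "hello" },
  { id_str: "456", text: "hello" },
];

// Counts how many groups result when documents are grouped on the given fields.
function countGroups(documents, fields) {
  const keys = new Set();
  for (const doc of documents) {
    // Serialise the values of the key fields so they can be compared as one key.
    keys.add(JSON.stringify(fields.map((f) => doc[f])));
  }
  return keys.size;
}

console.log(countGroups(sample, ["text", "id_str"])); // compound key
console.log(countGroups(sample, ["text"]));           // text only
```

Grouping on the compound key keeps both documents apart, while grouping on text alone merges them into a single group.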