
I have a collection containing documents in the following format:

{ 
    "_id" : ObjectId("5538e75c3cea103b25ff94a3"), 
    "userID" : "USER001", 
    "userName" : "manish", 
    "collegeIDs" : [
        "COL_HARY",
        "COL_MARY",
        "COL_JOHNS",
        "COL_CAS",
        "COL_JAMES",
        "COL_MARY",
        "COL_MARY",
        "COL_JOHNS"
    ]
}

I need to find the collegeIDs that are repeated. For the document above, the result should be "COL_MARY" and "COL_JOHNS", ideally with the number of times each one occurs. What MongoDB query would do this?

JohnnyHK
lime_pal
  • 1
    possible duplicate of [How to remove duplicate entries from an array?](http://stackoverflow.com/questions/9862255/how-to-remove-duplicate-entries-from-an-array) – Thomas Sep 10 '15 at 13:09
  • 2
    Please, search for other similar questions before posting your own. I found [this](http://stackoverflow.com/questions/9862255/how-to-remove-duplicate-entries-from-an-array) through Googling "mongodb find duplicate values array" in under a minute. There are plenty of resources out there to help you with this. Attempt also to show us what you have done. That way we can better guide you. – Thomas Sep 10 '15 at 13:11

1 Answer


There are probably many such documents, so you likely want the repeated collegeIDs per document, grouped by _id:

db.myCollection.aggregate([
  {"$project": {"collegeIDs":1}},
  {"$unwind":"$collegeIDs"},
  {"$group": {"_id":{"_id":"$_id", "cid":"$collegeIDs"}, "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}},
  {"$group": {"_id": "$_id._id", "collegeIDs":{"$addToSet":"$_id.cid"}}}
])
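The question also asks for the repeat count, which the final $group above discards. A minimal sketch of one way to keep it, assuming the same collection: replace $addToSet with $push of a small sub-document. The field names duplicates and collegeID are illustrative and do not come from the question.

```javascript
// Sketch: same stages as the pipeline above, but the last $group keeps
// each repeated value together with its count.
// "duplicates" and "collegeID" are hypothetical output field names.
const pipeline = [
  { $project: { collegeIDs: 1 } },
  { $unwind: "$collegeIDs" },
  // One group per (document, collegeID) pair, counting occurrences.
  { $group: { _id: { _id: "$_id", cid: "$collegeIDs" }, count: { $sum: 1 } } },
  // Keep only values that occur more than once in the same document.
  { $match: { count: { $gt: 1 } } },
  // Re-assemble one result per original document.
  { $group: {
      _id: "$_id._id",
      duplicates: { $push: { collegeID: "$_id.cid", count: "$count" } }
  } }
];

// In the mongo shell: db.myCollection.aggregate(pipeline)
```

For the sample document this should list COL_MARY with a count of 3 and COL_JOHNS with a count of 2. The order of entries inside duplicates is not guaranteed.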

Alternatively, since the question is not entirely clear, this may be what you want: the repeated collegeIDs and their counts for a single user:

db.myCollection.aggregate([
  {"$match": {"userID":"USER001"}},
  {"$project": {"collegeIDs":1, "_id":0}},
  {"$unwind":"$collegeIDs"},
  {"$group": {"_id":"$collegeIDs", "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}},
])
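
To check by hand what this query should return for the sample document, the $unwind, $group and $match steps can be mirrored in plain JavaScript without a database. This is only an illustrative sketch; findDuplicates is a hypothetical helper, not part of MongoDB.

```javascript
// Plain JavaScript mirror of $unwind -> $group -> $match (count > 1).
// findDuplicates is a hypothetical helper used only for illustration.
function findDuplicates(collegeIDs) {
  const counts = new Map();
  // Equivalent of $unwind + $group with {$sum: 1}: count each value.
  for (const id of collegeIDs) {
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  // Equivalent of $match {count: {$gt: 1}}: keep repeated values only.
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([id, count]) => ({ _id: id, count }));
}

const sample = [
  "COL_HARY", "COL_MARY", "COL_JOHNS", "COL_CAS",
  "COL_JAMES", "COL_MARY", "COL_MARY", "COL_JOHNS"
];

console.log(findDuplicates(sample));
// [ { _id: 'COL_MARY', count: 3 }, { _id: 'COL_JOHNS', count: 2 } ]
```

The aggregation should produce the same two entries, although MongoDB does not guarantee their order unless a $sort stage is added.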
Cetin Basoz