I have a large MongoDB collection that contains many duplicate inserts, like this:
{ "_id" : 1, "val" : "222222", "val2" : "37" }
{ "_id" : 2, "val" : "222222", "val2" : "37" }
{ "_id" : 3, "val" : "222222", "val2" : "37" }
{ "_id" : 4, "val" : "333333", "val2" : "66" }
{ "_id" : 5, "val" : "111111", "val2" : "22" }
{ "_id" : 6, "val" : "111111", "val2" : "22" }
{ "_id" : 7, "val" : "111111", "val2" : "22" }
{ "_id" : 8, "val" : "111111", "val2" : "22" }
I want to count the duplicates of each document and keep only one unique entry per group, with the count stored in the DB, like this:
{ "_id" : 1, "val" : "222222", "val2" : "37", "count" : "3" }
{ "_id" : 2, "val" : "333333", "val2" : "66", "count" : "1" }
{ "_id" : 3, "val" : "111111", "val2" : "22", "count" : "4" }
I have already looked at MapReduce and the aggregation framework, but as far as I can tell they never return the full document and only produce a single calculation for the whole collection.
Ideally the new data would be saved to a new collection.
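To make the goal concrete, here is a minimal sketch in plain Python of the transformation I am after. It runs on in-memory dicts and is not MongoDB code. The function name `collapse_duplicates` is made up for illustration, and renumbering `_id` from 1 in first-seen order is an assumption on my part.

```python
def collapse_duplicates(docs):
    """Group docs by (val, val2) and keep one entry per group with a count."""
    counts = {}  # (val, val2) -> occurrences; dicts keep first-seen order
    for doc in docs:
        key = (doc["val"], doc["val2"])
        counts[key] = counts.get(key, 0) + 1

    # Assumption: new _id values are renumbered from 1.
    # The count is stored as a string to match the desired output above.
    return [
        {"_id": i, "val": val, "val2": val2, "count": str(n)}
        for i, ((val, val2), n) in enumerate(counts.items(), start=1)
    ]


# Sample data copied from the collection shown above.
docs = [
    {"_id": 1, "val": "222222", "val2": "37"},
    {"_id": 2, "val": "222222", "val2": "37"},
    {"_id": 3, "val": "222222", "val2": "37"},
    {"_id": 4, "val": "333333", "val2": "66"},
    {"_id": 5, "val": "111111", "val2": "22"},
    {"_id": 6, "val": "111111", "val2": "22"},
    {"_id": 7, "val": "111111", "val2": "22"},
    {"_id": 8, "val": "111111", "val2": "22"},
]

result = collapse_duplicates(docs)
for doc in result:
    print(doc)
```

What I am looking for is a way to get the same result inside MongoDB itself, since the real collection is too large to pull into application memory like this.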