
Similar to map/reduce, but in reverse: does MongoDB have a way of reformatting data? I have a collection in the following format.

{ 
  {"token-id" : "LKJ8_lkjsd",
    "data": [
               {"views":100, "Date": "2015-01-01"},
               {"views":200, "Date": "2015-01-02"},
               {"views":300, "Date": "2015-01-03"},
               {"views":300, "Date": "2015-01-03"}
            ]
  }
}

I would like to process the entire collection into a new format, where every time-series data point is its own document mapped to the ID, ideally using some built-in MongoDB functionality similar to map/reduce. If there isn't one, I'd appreciate a strategy for doing this.

{
  { "token-id" : "LKJ8_lkjsd", "views": 100, "Date" : "2015-01-01"},
  { "token-id" : "LKJ8_lkjsd", "views": 200, "Date" : "2015-01-01"},
  { "token-id" : "LKJ8_lkjsd", "views": 300, "Date" : "2015-01-01"}
}
Dap
  • Maybe this helps: http://stackoverflow.com/questions/13281733/is-it-possible-to-flatten-mongodb-result-query – mvw Sep 25 '15 at 17:42

3 Answers

You need the $unwind stage from the aggregation pipeline; see the MongoDB documentation.

In your case the code would be:

db.yourcollection.aggregate( [ { $unwind : "$data" } ] )

$unwind does not insert documents into a new collection by itself. To write the results out, add an $out stage.

You can use:

> db.test.aggregate( [ { $unwind : "$data" }, {$project: {_id:0, "token-id":1, "data":1}}, {$out: "another"} ] )
> db.another.find()

In the first line you need to suppress _id, because after the $unwind you get 4 documents with the same _id (and thus they cannot all be inserted). Without an explicit _id, new values are generated automatically.

Here is the output that I got for your example:

{ "_id" : ObjectId("560599b1699289a5b754fab9"), "token-id" : "LKJ8_lkjsd", "data" : { "views" : 100, "Date" : "2015-01-01" } }
{ "_id" : ObjectId("560599b1699289a5b754faba"), "token-id" : "LKJ8_lkjsd", "data" : { "views" : 200, "Date" : "2015-01-02" } }
{ "_id" : ObjectId("560599b1699289a5b754fabb"), "token-id" : "LKJ8_lkjsd", "data" : { "views" : 300, "Date" : "2015-01-03" } }
{ "_id" : ObjectId("560599b1699289a5b754fabc"), "token-id" : "LKJ8_lkjsd", "data" : { "views" : 300, "Date" : "2015-01-03" } }
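
The output above still nests views and Date under data, while the question asks for them at the top level. A minimal sketch of a pipeline that lifts the nested fields in the $project stage, assuming the collection names test and another used above; the variable name flattenPipeline is only illustrative:

```javascript
// Pipeline definition only; in the mongo shell it would be run as
//   db.test.aggregate(flattenPipeline)
// ("test" and "another" are the collection names used in this answer).
const flattenPipeline = [
  // One output document per element of the "data" array.
  { $unwind: "$data" },
  // Drop the duplicated _id and lift the nested fields to the top level.
  {
    $project: {
      _id: 0,
      "token-id": 1,
      views: "$data.views",
      Date: "$data.Date"
    }
  },
  // Write the results to a collection instead of returning them.
  { $out: "another" }
];
```

Each resulting document would then have the shape { "token-id": ..., "views": ..., "Date": ... } plus an automatically generated _id.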
lanenok
  • This has the functionality but yields a single document result as opposed to a new collection. In addition, the size allocated by mongo for this operation is only 16 MB. My database is at 10 GB – Dap Sep 25 '15 at 18:17

The aggregate command can either return its results as a cursor or store them in a collection. In both cases the result set as a whole is not subject to the 16 MB size limit; only each individual document is. db.collection.aggregate() returns a cursor and can return result sets of any size.

var result = db.test.aggregate( [ { $unwind : "$data" }, {$project: {_id:0, "token-id":1, "data":1}} ] )

// Iterate the cursor and insert each unwound document into the target collection
while (result.hasNext()) {
    db.collection.insert(result.next());
}
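
For a collection of around 10 GB, one insert per document may be slow. A hedged sketch of the same loop with batching: copyInBatches and batchSize are hypothetical names, and the helper only assumes a cursor with hasNext()/next() (as in the mongo shell) and a target collection that offers insertMany() (available in the shell from MongoDB 3.2).

```javascript
// Drain a cursor into a target collection in fixed-size batches.
// Returns the number of documents handed to insertMany().
function copyInBatches(cursor, target, batchSize) {
  let batch = [];
  let total = 0;
  while (cursor.hasNext()) {
    batch.push(cursor.next());
    if (batch.length === batchSize) {
      target.insertMany(batch);
      total += batch.length;
      batch = [];
    }
  }
  // Flush whatever is left over after the last full batch.
  if (batch.length > 0) {
    target.insertMany(batch);
    total += batch.length;
  }
  return total;
}
```

In the shell this might be called as copyInBatches(result, db.collection, 1000), where result is the cursor from the aggregation above and 1000 is an arbitrary batch size.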
Rohit Jain

As per your question, with a large data set $unwind can make the query slow. In that case you could use $map in the aggregation to process the data array, like below:

db.collection.aggregate({
    "$project": {
        "result": {
            "$map": {
                "input": "$data",
                "as": "el",
                "in": {
                    "token-id": "$token-id",
                    "views": "$$el.views",
                    "Date": "$$el.Date"
                }
            }
        }
    }
}).pretty()
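
To illustrate what this projection yields, here is a plain-JavaScript imitation of the stages (not MongoDB API; mapStage, unwindResult and the sample data are illustrative only). $map rewrites the array inside each document, so one document goes in and one comes out; separate documents only appear once result is unwound.

```javascript
// Imitates {$project: {result: {$map: ...}}}: one document in, one out.
function mapStage(doc) {
  return {
    _id: doc._id,
    result: doc.data.map((el) => ({
      "token-id": doc["token-id"],
      views: el.views,
      Date: el.Date
    }))
  };
}

// Imitates {$unwind: "$result"}: one output document per array element.
function unwindResult(docs) {
  return docs.flatMap((doc) =>
    doc.result.map((el) => ({ _id: doc._id, result: el }))
  );
}

// Sample input shaped like the collection in the question.
const source = [
  {
    _id: 1,
    "token-id": "LKJ8_lkjsd",
    data: [
      { views: 100, Date: "2015-01-01" },
      { views: 200, Date: "2015-01-02" }
    ]
  }
];

const mapped = source.map(mapStage);   // still one document, holding an array
const unwound = unwindResult(mapped);  // one document per data point
```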
Neo-coder
  • Unfortunately, this only remaps each item and does not create additional documents per data record – Dap Sep 28 '15 at 15:12
  • @Dap I don't understand what output you are expecting, but maybe you should `unwind` `result`; then you will get separate documents. – Neo-coder Sep 28 '15 at 16:53
  • I appreciate the response. Yes, an unwind without limits is what I am looking for, with the exception of its 16 MB limit – Dap Sep 28 '15 at 17:02
  • @Dap I see that you got an answer, but the problem remains that you unwind the data and then iterate over it again to insert it into a new collection, so a better way is to just unwind and use [$out](http://docs.mongodb.org/manual/reference/operator/aggregation/out/) in the aggregation – Neo-coder Sep 28 '15 at 17:31