37

In the MongoDB aggregation framework, I was hoping to use the $unwind operator on an object (ie. a JSON collection). Doesn't look like this is possible, is there a workaround? Are there plans to implement this?

For example, take the article collection from the aggregation documentation . Suppose there is an additional field "ratings" that is a map from user -> rating. Could you calculate the average rating for each user?

Other than this, I'm quite pleased with the aggregation framework.

Update: here's a simplified version of my JSON collection per request. I'm storing genomic data. I can't really make genotypes an array, because the most common lookup is to get the genotype for a random person.

variants: [

    {
        name: 'variant1', 
        genotypes: {

            person1: 2,
            person2: 5,
            person3: 7,

        }
    }, 

    {
        name: 'variant2', 
        genotypes: {

            person1: 3,
            person2: 3,
            person3: 2,

        }
    }

]
Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
Brett Thomas
  • 1,694
  • 3
  • 15
  • 21
  • Given this is your collection(?) what is the aggregation you want to perform that you need $unwind for? – Asya Kamsky Jul 05 '12 at 19:15
  • is it a safe assumption that the number of "persons" in genotypes subdocument is variable? I'm not clear why you can't make genotypes an array, btw. – Asya Kamsky Jul 05 '12 at 19:16
  • Why does using an array as the datatype for genotypes preclude you from getting a random person out of it? – Tyler Brock Jul 06 '12 at 18:44
  • Currently the `$unwind` operator only works with arrays. Please be very specific about what you are trying to do, there may be another way to achieve the same result. – Tyler Brock Jul 06 '12 at 18:53

4 Answers4

33

It is not possible to do the type of computation you are describing with the aggregation framework - and it's not because there is no $unwind method for non-arrays. Even if the person:value objects were documents in an array, $unwind would not help.

The "group by" functionality (whether in MongoDB or in any relational database) is done on the value of a field or column. We group by value of field and sum/average/etc based on the value of another field.

Simple example is a variant of what you suggest, ratings field added to the example article collection, but not as a map from user to rating but as an array like this:

{ title : title of article", ...
  ratings: [
         { voter: "user1", score: 5 },
         { voter: "user2", score: 8 },
         { voter: "user3", score: 7 }
  ]
}

Now you can aggregate this with:

[ {$unwind: "$ratings"},
  {$group : {_id : "$ratings.voter", averageScore: {$avg:"$ratings.score"} } } 
]

But this example structured as you describe it would look like this:

{ title : title of article", ...
  ratings: {
         user1: 5,
         user2: 8,
         user3: 7
  }
}

or even this:

{ title : title of article", ...
  ratings: [
         { user1: 5 },
         { user2: 8 },
         { user3: 7 }
  ]
}

Even if you could $unwind this, there is nothing to aggregate on here. Unless you know the complete list of all possible keys (users) you cannot do much with this. [*]

An analogous relational DB schema to what you have would be:

CREATE TABLE T (
   user1: integer,
   user2: integer,
   user3: integer
   ...
);

That's not what would be done, instead we would do this:

CREATE TABLE T (
   username: varchar(32),
   score: integer
);

and now we aggregate using SQL:

select username, avg(score) from T group by username;

There is an enhancement request for MongoDB that may allow you to do this in the aggregation framework in the future - the ability to project values to keys to vice versa. Meanwhile, there is always map/reduce.

[*] There is a complicated way to do this if you know all unique keys (you can find all unique keys with a method similar to this) but if you know all the keys you may as well just run a sequence of queries of the form db.articles.find({"ratings.user1":{$exists:true}},{_id:0,"ratings.user1":1}) for each userX which will return all their ratings and you can sum and average them simply enough rather than do a very complex projection the aggregation framework would require.

Community
  • 1
  • 1
Asya Kamsky
  • 41,784
  • 5
  • 109
  • 133
  • Sadly I think this is true...I'm in a similar situation and the reason for having the structure like this is for simple increments elsewhere. However, while the aggregation framework can't handle this, MongoDB can still group and aggregate these fields. You just need to do it with MongoCode in a $group. You'll basically loop and say if the key does not exist, set it to the current value. If it does, add to the value. Pretty simple. It is unfortunate that the aggregation framework can't handle something so simple. Actually, using the aggregation framework leads to more code maintenance. – Tom Oct 25 '12 at 17:06
  • You can vote on this https://jira.mongodb.org/browse/SERVER-5947 to get a feature in mongo that may make it possible. You should also change your schema to use "name":"username","value":num array elements and that will *not* prevent you from being able to use $inc operator since you can match array element and update it with $ "positional operator" – Asya Kamsky Oct 28 '12 at 16:59
  • Post it as a question - comments are for discussion of this question and its answers only. – Asya Kamsky Nov 22 '17 at 18:13
15

Since 3.4.4, you can transform object to array using $objectToArray

See: https://docs.mongodb.com/manual/reference/operator/aggregation/objectToArray/

Adrian
  • 9,102
  • 4
  • 40
  • 35
1

This is an old question, but I've run across a tidbit of information through trial and error that people may find useful.

It's actually possible to unwind on a dummy value by fooling the parser this way:

db.Opportunity.aggregate(
  { $project: {
        Field1: 1, Field2: 1, Field3: 1,
        DummyUnwindField: { $ifNull: [null, [1.0]] }
    }
  },
  { $unwind: "$DummyUnwindField" }
);

This will produce 1 row per document, regardless of whether or not the value exists. You may be able tinker with this to generate the results you want. I had hoped to combine this with multiple $unwinds to (sort of like emit() in map/reduce), but alas, the last $unwind wins or they combine as an intersection rather than union which makes it impossible to achieve the results I was looking for. I am sadly disappointed with the aggregate framework functionality as it doesn't fit the one use case I was hoping to use it for (and seems strangely like a lot of the questions on StackOverflow in this area are asking) - ordering results based on match rate. Improving the poor map reduce performance would have made this entire feature unnecessary.

saarp
  • 1,931
  • 1
  • 15
  • 28
0

This is what I found & extended.

Lets create experimental database in mongo

db.copyDatabase('livedb' , 'experimentdb')

Now Use experimentdb & convert Array to object in your experimentcollection

db.getCollection('experimentcollection').find({}).forEach(function(e){
    if(e.store){
        e.ratings = [e.ratings]; //Objects name to be converted to array eg:ratings
        db.experimentcollection.save(e);
    }
})

Some nerdy js code to convert json to flat object

var flatArray = [];

var data = db.experimentcollection.find().toArray();

for (var index = 0; index < data.length; index++) {

  var flatObject = {};

  for (var prop in data[index]) {

    var value = data[index][prop];

    if (Array.isArray(value) && prop === 'ratings') {
      for (var i = 0; i < value.length; i++) {
        for (var inProp in value[i]) {
          flatObject[inProp] = value[i][inProp];
        }
      }
    }else{
        flatObject[prop] = value;
    }
  }
  flatArray.push(flatObject);
}

printjson(flatArray);
vijay
  • 10,276
  • 11
  • 64
  • 79