The data structure I'm working with is a MongoDB object that contains many embedded objects. The catch is that an object may contain two or more copies of the same embedded object (they have the same ID). Using map/reduce, I'd like to get an aggregate count that counts an embedded object only once per containing object, rather than once per occurrence. Any help would be greatly appreciated. See the code samples below:

// working map function that counts every occurrence of an embedded object
function(){
  if(this.embeddedObjects != undefined){
    this.embeddedObjects.forEach(function(e){
      emit(e['_id'].toString(), 1);
    });
  }
}

// non-working map function meant to count an embedded object only once per object
function(){
  if(this.embeddedObjects != undefined){
    var embeddedIds = new Array();
    this.embeddedObjects.forEach(function(e){
      if(embeddedIds.join(',').indexOf(e['_id'].toString()) != -1){
        embeddedIds.push(e['_id'].toString());
        emit(e['_id'].toString(), 1);
      }
    });
  }
}

// reduce function
function(key,values){
  var count = 0;
  values.forEach(function(v){
    count += v;
  });
  return count;
}
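
The second map function above appears to emit nothing because its `indexOf(...) != -1` test is inverted: an ID is pushed and emitted only when it is already in `embeddedIds`, which starts out empty. Searching a comma-joined string can also match one ID inside another. A minimal sketch of a corrected version follows, assuming each embedded object has an `_id` whose string form identifies it. The names `mapOncePerDocument` and `seen`, and the local `emit` stand-in, are illustrative and not part of the original code.

```javascript
// Local stand-in for MongoDB's emit(), used only to try the function
// outside the server. Leave it out when passing the function to mapReduce.
var emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Emits each embedded ID at most once per containing object.
function mapOncePerDocument() {
  if (this.embeddedObjects != undefined) {
    var seen = {}; // IDs already emitted for this object
    this.embeddedObjects.forEach(function (e) {
      var id = e['_id'].toString();
      // Emit only the first time this ID is seen (the original test was inverted).
      if (!Object.prototype.hasOwnProperty.call(seen, id)) {
        seen[id] = true;
        emit(id, 1);
      }
    });
  }
}

// Try it on one sample object that embeds 'a' twice and 'b' once.
mapOncePerDocument.call({
  embeddedObjects: [{ _id: 'a' }, { _id: 'a' }, { _id: 'b' }]
});
```

With this map function, the reduce function above should work unchanged, since it only sums the emitted values.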
tchrist
Mitch Kett

1 Answer

One option is to collect the unique IDs during the reduce phase and use the finalize function to count them. Please see here for an example.
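
A rough sketch of that idea, applied to the question's data, might look like the following. It assumes the map step emits the containing object's `_id` as a one-element set under each embedded ID, so that reduce can merge the sets and finalize can count them. The names `mapFn`, `reduceFn`, `finalizeFn`, and `runMapReduce` are hypothetical, and `runMapReduce` is only a small in-memory stand-in for the server, not MongoDB's API.

```javascript
var emit; // assigned by the in-memory driver below; MongoDB provides its own

// Map: key = embedded ID, value = set containing the parent object's ID.
function mapFn() {
  if (this.embeddedObjects != undefined) {
    var parentId = this._id.toString();
    this.embeddedObjects.forEach(function (e) {
      var parents = {};
      parents[parentId] = true;
      emit(e['_id'].toString(), { parents: parents });
    });
  }
}

// Reduce: merge the sets. The output has the same shape as the map output,
// so it stays valid if reduce is applied again to its own results.
function reduceFn(key, values) {
  var merged = {};
  values.forEach(function (v) {
    Object.keys(v.parents).forEach(function (id) {
      merged[id] = true;
    });
  });
  return { parents: merged };
}

// Finalize: the count is the number of distinct parent objects.
function finalizeFn(key, reduced) {
  return Object.keys(reduced.parents).length;
}

// Minimal in-memory driver, for illustration only.
function runMapReduce(docs) {
  var groups = {};
  emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) { mapFn.call(doc); });
  var result = {};
  Object.keys(groups).forEach(function (key) {
    result[key] = finalizeFn(key, reduceFn(key, groups[key]));
  });
  return result;
}

var counts = runMapReduce([
  { _id: 1, embeddedObjects: [{ _id: 'a' }, { _id: 'a' }, { _id: 'b' }] },
  { _id: 2, embeddedObjects: [{ _id: 'a' }] },
  { _id: 3 }
]);
```

One trade-off of this design is that the set stored per key grows with the number of containing objects, so it is probably better suited to modest data sizes.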

Alternatively, if you just want to count unique IDs and the path to the embedded field is fixed, I believe you should be able to use the distinct command, which is much simpler.
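
As a hedged illustration of the distinct approach: the shell call is shown in a comment, with `things` as a hypothetical collection name, followed by a plain-JavaScript function (`distinctEmbeddedIds`, also hypothetical) that mimics what it computes on in-memory data. Note that this yields the set of unique embedded IDs across the whole collection, not a count per ID, and the sketch compares IDs by their string form as a simplification.

```javascript
// In the mongo shell the call would look roughly like:
//   db.things.distinct("embeddedObjects._id").length

// In-memory illustration: collect the distinct embedded IDs across all
// objects, skipping objects that have no embeddedObjects array.
function distinctEmbeddedIds(docs) {
  var seen = {};
  docs.forEach(function (doc) {
    (doc.embeddedObjects || []).forEach(function (e) {
      seen[e['_id'].toString()] = true;
    });
  });
  return Object.keys(seen);
}

var ids = distinctEmbeddedIds([
  { _id: 1, embeddedObjects: [{ _id: 'a' }, { _id: 'a' }, { _id: 'b' }] },
  { _id: 2, embeddedObjects: [{ _id: 'a' }] },
  { _id: 3 }
]);
var uniqueCount = ids.length; // number of unique embedded IDs
```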

Community
Ren