
Yep, I'm a SQL jockey (sorta) coming into the CouchDB Map/Reduce world. I thought I had figured out the equivalent of the SQL COUNT(*) aggregate function for CouchDB datasets with the following:

Map:

function(doc) {
  emit(doc.name, doc);
}

Reduce:

function(keys, values, rereduce){
  return values.length;
}

Which I thought worked, returning something like:

"super fun C"   2
"super fun D"   2
"super fun E"   2
"super fun F"   18

... but not really. When I add a record, this count varies wildly. Sometimes the count actually decreases, which was very surprising. Am I doing something wrong? Maybe I don't fully understand the concept of eventual consistency?

Brad Gessler

2 Answers


In your reduce just put:

_count

You can also get a sum using:

_sum

So basically, set reduce: "_sum" or reduce: "_count". For _sum, make sure the value your map emits is a number; _count simply counts the emitted rows, whatever their values are.

See "Built in reduce functions".
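For context, here is a minimal sketch of where the built-in reduce goes in a design document. The names `_design/names` and `count_by_name` are hypothetical, chosen only for illustration, and the `doc.name` guard is an added assumption rather than part of the question's map.

```javascript
// Hypothetical design document; "_design/names" and "count_by_name"
// are made-up names, not taken from the question.
var designDoc = {
  _id: "_design/names",
  views: {
    count_by_name: {
      // View functions are stored as strings of JavaScript source.
      // Emitting 1 keeps the view usable with "_sum" as well.
      map: "function (doc) { if (doc.name) { emit(doc.name, 1); } }",
      // Built-in reduce: a plain string instead of a JavaScript function.
      reduce: "_count"
    }
  }
};

console.log(JSON.stringify(designDoc, null, 2));
```

Querying such a view with `group=true` should return one count per distinct `doc.name`, which is the closest match to `COUNT(*) ... GROUP BY name`.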

LyteFM
David Coallier

It looks like your reduce results are being re-reduced. That is, reduce is called more than once for each key and then called again with those results. You can handle that with a reduce function like this:

function(keys, values, rereduce) {
  if (rereduce) {
    return sum(values);
  } else {
    return values.length;
  }
}
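
To see why the count can shrink, here is a rough local simulation of the two-stage process. It is only a sketch: `simulate` and its fixed chunk size are hypothetical stand-ins for however CouchDB actually batches rows, and `sum` is redefined here because the real one is supplied by CouchDB's JavaScript view server.

```javascript
// Local stand-in for the sum() helper available inside CouchDB views.
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// The reduce from the question: ignores the rereduce flag.
function naiveReduce(keys, values, rereduce) {
  return values.length;
}

// Rereduce-aware version: adds up partial counts on the second pass.
function safeReduce(keys, values, rereduce) {
  return rereduce ? sum(values) : values.length;
}

// Hypothetical driver: reduce the rows in chunks, then rereduce the
// partial results. The keys argument is unused by both reduces, so null is passed.
function simulate(reduceFn, rows, chunkSize) {
  var partials = [];
  for (var i = 0; i < rows.length; i += chunkSize) {
    partials.push(reduceFn(null, rows.slice(i, i + chunkSize), false));
  }
  return reduceFn(null, partials, true);
}

var rows = new Array(18).fill(1); // 18 emitted rows for a single key

console.log(simulate(naiveReduce, rows, 6)); // 3: the number of chunks, not rows
console.log(simulate(safeReduce, rows, 6));  // 18
```

With the naive version the result depends on how the rows happen to be batched, which would explain counts that jump around or drop after a document is added.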

Alternatively, you can change the map function so that the values are always a count of documents:

// map
function(doc) {
  emit(doc.name, 1);
}

// reduce
function(keys, values, rereduce) {
  return sum(values);
}
Will Harris

Using JavaScript reduce functions instead of the built-in ones will give you very bad performance. See David's answer. – wallacer Jun 09 '14 at 22:38