In our CouchDB document database, we have documents with different "status" property values like this:
doc1: {status: "available"},
doc2: {status: "reserved"},
doc3: {status: "available"},
doc4: {status: "sold"},
doc5: {status: "available"},
doc6: {status: "destroyed"},
doc7: {status: "sold"}
[...]
Now, I would like to write a map-reduce function that returns all distinct status values that exist over all documents: ["available", "reserved", "sold", "destroyed"]
.
My approach was to begin writing a map function that returns only the "status" property of each document:
function (doc) {
if(doc.status) {
emit(doc._id, doc.status);
}
}
And now, I would like to compare all map rows to each other such that no status duplicates will be returned.
The official CouchDB documentation seems to be very detailed and technical, but cannot really be projected to our use case, which does not have any nested structures like in blog posts but simply "flat objects" with a "status" property. Besides, our backend uses PouchDB as an adapter to connect to our remote CouchDB.
I discovered that when executing the reduce function below (which I implemented myself trying to understand what happens under the hood), some strange result will be returned.
function(keys, values, rereduce) {
var array = [];
if(rereduce) {
return values;
} else {
if(array.indexOf(values[0]) === -1) {
array.push(values[0]);
}
}
return array;
}
Result:
{
"rows": [
{
"key": null,
"value": "[reduce] [status] available,available,[status] sold,unknown,[status] available,[status] available,[status] available,reserved,available,[status] reserved,available,[status] available,[status] sold,reserved,[status] sold,sold,[status] available,available,[status] reserved,[status] reserved,[status] available,[status] reserved,available"
}
]
}
The reduce
step seems to be executed exactly once, while the status
loops sometimes have only a single value, then two or three values, without a recognizable logic or pattern.
Could somebody please explain to me the following:
- How to retrieve an array with all distinct status values
- What is the logic (or workflow) of the
reduce
function of CouchDB? Why dostatus
rows have an arbitrary number of status values?