I'm looking to create a report on a data set that shows how densely or sparsely populated the data set is.
Specifically, I'd like to:
Iterate through each _ID (there are about 250 within the data set, so writing them out by hand would be very time-consuming)
For each _ID, produce a count of NULL / NOT NULL values
Output the above like so:
Customer_ID    350000  0
Customer_Name  150000  200000
Customer_DOB   200000  150000
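To make the goal concrete, here is a rough sketch in plain JavaScript of the counting I have in mind, run against a small in-memory array rather than the real collection. The function name `countNulls` and the sample documents are made up for illustration, and I'm assuming that a missing key should count the same as an explicit null and that the columns are NOT NULL first, then NULL.

```javascript
// Hypothetical helper (not a MongoDB API): tallies non-null vs. null values per key.
function countNulls(docs) {
  // First pass: collect every key that appears in any document.
  var keys = {};
  docs.forEach(function (doc) {
    Object.keys(doc).forEach(function (k) { keys[k] = true; });
  });

  // Start each key with zeroed counters.
  var report = {};
  Object.keys(keys).forEach(function (k) {
    report[k] = { notNull: 0, isNull: 0 };
  });

  // Second pass: for every document, check every known key.
  docs.forEach(function (doc) {
    Object.keys(report).forEach(function (k) {
      // Assumption: a missing key is treated like an explicit null.
      if (doc[k] === null || doc[k] === undefined) {
        report[k].isNull += 1;
      } else {
        report[k].notNull += 1;
      }
    });
  });
  return report;
}

// Made-up sample documents, only to show the shape of the result.
var sample = [
  { Customer_ID: 1, Customer_Name: "Ann", Customer_DOB: null },
  { Customer_ID: 2, Customer_Name: null },
  { Customer_ID: 3, Customer_Name: "Bob", Customer_DOB: "1980-01-01" }
];

var report = countNulls(sample);
Object.keys(report).forEach(function (k) {
  console.log(k, report[k].notNull, report[k].isNull);
});
```

This only illustrates the desired output; it says nothing about how to do the same thing efficiently inside MongoDB.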
Any hints / tips on how to do this? I've searched for a while, and iterating through all of the _ID keys seems to be possible via map-reduce:
MongoDB Get names of all keys in collection
and then I used this:
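// Collect every distinct top-level key in the mpfs collection
mr = db.runCommand({
    "mapreduce": "mpfs",
    "map": function() {
        // emit each key of the current document; the value is not needed
        for (var key in this) { emit(key, null); }
    },
    "reduce": function(key, stuff) { return null; },
    // the distinct keys are written to the mpfs_keys collection
    "out": "mpfs" + "_keys"
})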
but then I'm not sure how to use the map-reduce output to feed another query that would produce the results.
I'm also unsure whether this is the most efficient way of doing this, as I'm from an SQL background. Is there a more efficient way?
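For feeding the key list into a second query, the closest I can picture is something like the sketch below. It is untested against my real data. The helper name `nullReport` is my own invention, and I'm assuming the legacy shell methods `find()`, `cursor.forEach()` and `count()` are available, and that each document in the map-reduce output collection holds the key name in its `_id`. As far as I understand, a `{ field: null }` query matches both explicit nulls and missing fields.

```javascript
// Hypothetical helper (my own name, not a MongoDB API).
// keysColl: the map-reduce output collection; each document's _id is a key name.
// dataColl: the original collection being profiled.
function nullReport(keysColl, dataColl) {
  var total = dataColl.count({}); // total number of documents
  var rows = [];
  keysColl.find().forEach(function (k) {
    var query = {};
    query[k._id] = null; // matches explicit nulls and missing fields
    var nulls = dataColl.count(query);
    rows.push({ key: k._id, notNull: total - nulls, isNull: nulls });
  });
  return rows;
}

// In the mongo shell this would be called roughly as:
// nullReport(db.mpfs_keys, db.mpfs).forEach(function (r) {
//   print(r.key, r.notNull, r.isNull);
// });
```

This runs one count query per key, so with about 250 keys it means about 250 passes over the collection, which is part of why I doubt it is the efficient way.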