
I'm looking to create a report on a data set that shows how densely or sparsely populated the data set is.

Specifically, I'd like to:

  1. Iterate through each _ID (there are about 250 in the dataset, so crafting them manually would be very time-consuming)

  2. For each _ID, produce a count of NULL / NOT NULL values

  3. Output the above like so:

Customer_ID 350000 0
Customer_Name 150000 200000
Customer_DOB 200000 150000
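To make the intended report concrete, here is a minimal sketch in plain JavaScript that runs over in-memory sample documents rather than against MongoDB. The function name `countPopulation` and the sample data are hypothetical, and it assumes that a key missing from a document should count as NULL, which the question does not state explicitly.

```javascript
// Count, for every key seen in any document, how many documents hold a
// real value and how many hold null (or lack the key entirely).
function countPopulation(docs) {
  // Object.create(null) avoids clashes with inherited names like "constructor".
  var counts = Object.create(null);

  // First pass: discover every key that appears in at least one document.
  docs.forEach(function (doc) {
    Object.keys(doc).forEach(function (key) {
      if (!counts[key]) {
        counts[key] = { notNull: 0, isNull: 0 };
      }
    });
  });

  // Second pass: classify each document for each known key.
  docs.forEach(function (doc) {
    Object.keys(counts).forEach(function (key) {
      var present = Object.prototype.hasOwnProperty.call(doc, key) &&
        doc[key] !== null && doc[key] !== undefined;
      if (present) {
        counts[key].notNull += 1;
      } else {
        counts[key].isNull += 1; // assumption: missing is treated like NULL
      }
    });
  });

  return counts;
}

// Hypothetical sample data, loosely modelled on the column names above.
var sampleDocs = [
  { Customer_ID: 1, Customer_Name: "Ann", Customer_DOB: null },
  { Customer_ID: 2, Customer_Name: null },
  { Customer_ID: 3 }
];

var report = countPopulation(sampleDocs);
Object.keys(report).forEach(function (key) {
  console.log(key + " " + report[key].notNull + " " + report[key].isNull);
});
```

The sketch returns named counts per key, so the column order of the printed report can be chosen freely.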

Any hints or tips on how to do this? I've searched for a while, and iterating through all of the _ID keys seems to be possible via mapreduce:

MongoDB Get names of all keys in collection

and then used this:

mr = db.runCommand({
  "mapreduce" : "mpfs",
  "map" : function() {
    for (var key in this) { emit(key, null); }
  },
  "reduce" : function(key, stuff) { return null; }, 
  "out": "mpfs" + "_keys"
})

but then I'm not sure how to use the mapreduce result set to feed another query that would produce the results.

I'm also unsure whether this is the most efficient way of doing this, as I'm from an SQL background. Is there a more efficient way?
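On feeding the key list into a second query: one possible approach, sketched here under assumptions, is to build a single `$group` stage programmatically from the list of keys. The helper name `buildDensityPipeline` and the `_null` / `_notNull` output field suffixes are hypothetical. The sketch uses `$ifNull` so that a missing field is treated the same as an explicit null; values such as empty strings from a CSV import would still count as not null.

```javascript
// Build an aggregation pipeline with one $group stage that counts, for each
// key, the documents where the field is null/missing and where it has a value.
function buildDensityPipeline(keys) {
  // _id: null groups the whole collection into a single result document.
  var group = { _id: null, total: { $sum: 1 } };

  keys.forEach(function (key) {
    // $ifNull yields the replacement (null) when the field is null or missing,
    // so this expression is true for both cases.
    var isNull = { $eq: [{ $ifNull: ["$" + key, null] }, null] };

    // Suffixing the output names avoids a clash with the group's own _id.
    group[key + "_null"] = { $sum: { $cond: [isNull, 1, 0] } };
    group[key + "_notNull"] = { $sum: { $cond: [isNull, 0, 1] } };
  });

  return [{ $group: group }];
}

// Hypothetical key list; in practice it would come from the key collection.
var pipeline = buildDensityPipeline(["Customer_ID", "Customer_Name"]);
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell this could presumably be run as `db.mpfs.aggregate(buildDensityPipeline(db[mr.result].distinct("_id")))`, assuming the map-reduce output collection holds one document per key. This makes one pass over the collection instead of one query per key, though it has not been tested against the actual data.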

mr_gooding
  • You don't provide enough detail for an answer, but you'd typically use an [aggregation pipeline](http://docs.mongodb.org/manual/core/aggregation-pipeline/) to do this. – JohnnyHK Jan 12 '15 at 14:01
  • Hi JohnnyHK - thanks for the reply. I've got the keys into a result set, but am still unsure how to use this within an aggregation pipeline. db[mr.result].distinct("_id") contains my result set, but I can't figure out the syntax to run something like: db.mpfs.aggregate( { $group : {_id : db[mr.result].distinct("_id"), total: { $exists: true }} } ); – mr_gooding Jan 13 '15 at 11:18
  • We need more detail to help. What do the documents look like? Is the document structure known/specified? – wdberkeley Jan 14 '15 at 15:25
  • Hi wdberkeley - the structure isn't known or specified, unfortunately; there is a set of .csv files which have some overlap and some differences. My initial result set was collected by: – mr_gooding Jan 14 '15 at 16:19

0 Answers