4

I am trying to perform a sum in a group by operation in rmongodb. The "column" that I am summing contains NaN values with cause my sum to return NaN. I want to use $project to replace the NaNs with zeros before running the group by command but I am stuck.

This will produce dummy data (in the mongo console):

db.NAtest.insert({ y : 1, x : 1})
db.NAtest.insert({ y : 1, x : 2})
db.NAtest.insert({ y : 2, x : 3})
db.NAtest.insert({ y : 2, x : 4})
db.NAtest.insert({ y : 2, x : NaN})

This will produce the desired result (in the mongo console):

db.NAtest.aggregate( { $project : { y : 1, 
                                    x : { $cond : [ { $ne : ['$x', NaN] }, 
                                                    '$x', 
                                                    0] } } } )

This is what I have working in R:

library(rmongodb)
mongo <- mongo.create()
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "aggregate", "NAtest");
mongo.bson.buffer.start.array(buf, "pipeline")
mongo.bson.buffer.append.bson(buf, 
                              "0", 
                              mongo.bson.from.list(list('$project' = list(y = 1, 
                                                                         x= 1))))
mongo.bson.buffer.finish.object(buf)
cmd <- mongo.bson.from.buffer(buf)

result <- mongo.command(mongo, "rmdb", cmd)

When I try to add the $cond to transform the NaNs, using the following:

x = list("$cond" = c(list("$ne" = c("$x", as.numeric(NA))), "$x", "0"))

instead of

x = 1

No results are returned and mongo.get.err() returns the value 10 indicating that the BSON is invalid.

If I run print(cmd) in the R session when the $cond portion is included, the following output is produced:

> print(cmd)
    aggregate : 2    NAtest
    pipeline : 4     
            0 : 3    
                    $project : 3     
                            y : 1    1.000000
                            x : 3    
                                    $cond : 3        
                                            $ne : 4          
                                                    0 : 2    $x
                                                    1 : 2    NA

                                             : 2     $x
                                             : 2     0

I do not think that the problem lies with NaN / as.numeric(NA) part of the code since when I try to replace say the number 2 with zero, I get the same error.

Thanks in advance.

  • Having the same problem with this query: query<-mongo.bson.from.JSON(' { "$group": { "_id":"$created", "count": { "$sum": 1} } }') – Daniel Cerecedo Jul 01 '14 at 20:21

1 Answers1

1

It seems rmongodb cannot really handle operators embedded in other operator when aggregating. I faced the same issue when using $substr in $group.

Try only one operator in a stage. If it is not possible, I can recommend RMongo package as an alternative.

dbAggregate(
    mongo, "db", '{
        $project : { 
            y : 1, 
            x : {$cond : [{ $ne : ['$x', NaN] }, '$x', 0]}
        }
}')
benka
  • 4,732
  • 35
  • 47
  • 58
AnotherNewbie
  • 198
  • 1
  • 1
  • 6