I wrote a map-reduce job for MongoDB, but the result has a problem.

the data:

mongos> db.perGoods.find()

{ "_id" : ObjectId("514bf6428f43a9fee9cef526"), "id" : 1, "goods_id" : "1234", "keywords" : [   {   "keyword" : "lianyiqun",    "price" : 3.52 },   {   "keyword" : "nvzhuang",     "price" : 4.27 },   {   "keyword" : "chunkuan",     "price" : 3.12 },   {   "keyword" : "chaoliu",  "price" : 8.32 },   {   "keyword" : "duanzhuang",   "price" : 4.92 } ] }
{ "_id" : ObjectId("514bf65d8f43a9fee9cef527"), "id" : 2, "goods_id" : "5678", "keywords" : [   {   "keyword" : "lianyiqun",    "price" : 9.26 },   {   "keyword" : "nvzhuang",     "price" : 4.52 } ] }
{ "_id" : ObjectId("514bf6768f43a9fee9cef528"), "id" : 3, "goods_id" : "5612", "keywords" : [   {   "keyword" : "lianyiqun",    "price" : 7.42 },   {   "keyword" : "nvzhuang",     "price" : 6.52 } ] }
{ "_id" : ObjectId("514bf6968f43a9fee9cef529"), "id" : 4, "goods_id" : "9612", "keywords" : [   {   "keyword" : "lianyiqun",    "price" : 3.12 },   {   "keyword" : "nvzhuang",     "price" : 6.57 },   {   "keyword" : "chunzhuang",   "price" : 5.55 } ] }

the map function:

mongos> var mapFunction = function() {
...                        for (var index = 0; index < this.keywords.length; index++) {
...                            var key = this.goods_id;
...                            var value = {
...                                          count: 1,
...                                          price: this.keywords[index].price
...                                        };
...                            emit(key, value);
...                        }
...                     };

the reduce function:

mongos> var reduceFunction = function(key, priceCountObjects) {
...                           var reducedValue = { count: 0, sumprice: 0 };
... 
...                           for (var index = 0; index < priceCountObjects.length; index++) {
...                               reducedValue.count += priceCountObjects[index].count;
...                               reducedValue.sumprice += priceCountObjects[index].price;
...                           }
... 
...                           return reducedValue;
...                       };

the code:

mongos> db.perGoods.mapReduce(
...                      mapFunction,
...                      reduceFunction,
...                      { out: "map_reduce_test" }
...                    )
{
    "result" : "map_reduce_test",
    "timeMillis" : 5,
    "counts" : {
        "input" : 4,
        "emit" : 12,
        "reduce" : 4,
        "output" : 4
    },
    "ok" : 1,
}

the result:

mongos> db.map_reduce_test.find()
{ "_id" : "1234", "value" : { "count" : 5, "sumprice" : 24.15 } }
{ "_id" : "5612", "value" : { "count" : 2, "sumprice" : 13.94 } }
{ "_id" : "5678", "value" : { "count" : 2, "sumprice" : 13.78 } }
{ "_id" : "9612", "value" : { "count" : 3, "sumprice" : 15.240000000000002 } }
mongos> 

Why is the last result 15.240000000000002?

  • Is the result wrong? `3.12+6.57+5.55 == 15.240000000000002 // true`. Or is this question about why it has so many decimal places? – beatgammit Mar 22 '13 at 07:02
  • Related: http://stackoverflow.com/questions/11150741/javascript-add-operation-returns-bad-result/11150751#11150751 – beatgammit Mar 22 '13 at 07:09

1 Answer

Since map/reduce in MongoDB is based on JavaScript, it uses JavaScript's number arithmetic, which is IEEE 754 double-precision floating point arithmetic.

Floating point numbers are (essentially; for details refer to the Wikipedia article) stored as

mantissa * base ^ exponent

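In this case the base is 2 (base-10 floating point numbers are called decimal), and many numbers, including simple decimal fractions such as 6.57, cannot be represented exactly in this format because the mantissa has a limited number of bits. The nearest representable value is stored instead, and these small errors can become visible when values are added together.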
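To make this concrete, here is a small sketch in plain JavaScript that repeats the addition the reduce function performs for goods_id "9612". The variable names are only illustrative; the three prices are taken from the question's data.

```javascript
// Prices of the three keywords stored for goods_id "9612".
var prices = [3.12, 6.57, 5.55];

// Sum them left to right, starting from 0, as the reduce function does.
var sum = 0;
for (var i = 0; i < prices.length; i++) {
    sum += prices[i];
}

// The accumulated binary rounding error shows up in the last digit.
console.log(sum);            // prints 15.240000000000002

// The sum is a different double than the one the literal 15.24 maps to.
var sumIsExact = (sum === 15.24);
console.log(sumIsExact);     // prints false

// toFixed(20) shows more digits of the value actually stored for 6.57.
console.log((6.57).toFixed(20));
```

So the result is not a MongoDB bug: any JavaScript engine that adds these three doubles in this order produces the same value.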

As it currently stands, your best bet is probably to round at the end of the operation. Unfortunately, not all currencies have two decimal places, so internationalization might become a pain, and there is still a risk that the rounding error becomes significant. To be sure, read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
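One way to do the rounding at the end is a finalize function, which mapReduce runs once per key after all reduce calls. The following is a minimal sketch, not a definitive implementation: the name finalizeFunction is hypothetical, and it assumes that every price uses exactly two decimal places.

```javascript
// Hypothetical finalize function: rounds the summed price once, at the end.
// Assumption: all prices have two decimal places (factor 100).
var finalizeFunction = function(key, reducedValue) {
    // MongoDB does not call reduce for a key with a single emitted value,
    // so such values still have the shape { count, price } and are left as is.
    if (typeof reducedValue.sumprice !== "number") {
        return reducedValue;
    }
    var factor = 100;
    reducedValue.sumprice = Math.round(reducedValue.sumprice * factor) / factor;
    return reducedValue;
};

// In the mongo shell it would be passed through the finalize option:
// db.perGoods.mapReduce(mapFunction, reduceFunction,
//                       { out: "map_reduce_test", finalize: finalizeFunction })

// Quick check outside MongoDB with the value from the question.
var rounded = finalizeFunction("9612", { count: 3, sumprice: 15.240000000000002 });
console.log(rounded.sumprice);   // prints 15.24
```

Rounding only once, after the whole sum is built, keeps intermediate rounding from compounding. For currencies with a different number of decimal places the factor would have to change.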

You can also vote for this feature in 10gen's Jira.

mnemosyn