2

For special values like NA or NaN, boost::unordered_map creates a new key each time I use insert.

// [[Rcpp::depends(BH)]]
#include <boost/unordered_map.hpp>
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void test_unordered_map(NumericVector vec) {

  boost::unordered_map<double, int> mymap;
  int n = vec.size();
  for (int i = 0; i < n; i++) {
    mymap.insert(std::make_pair(vec[i], i));
  }

  boost::unordered_map<double, int>::iterator it = mymap.begin(), end = mymap.end();
  while (it != end) {
    Rcout << it->first << "\t";
    it++;
  }
  Rcout << std::endl;
}

/*** R
x <- c(sample(10, 100, TRUE), rep(NA, 5), NaN) + 0
test_unordered_map(x)
*/

Result:

> x <- c(sample(10, 100, TRUE), rep(NA, 5), NaN) + 0

> test_unordered_map(x)
nan nan nan nan nan nan 4   10  9   5   7   6   2   3   1   8   

How do I create only one key for NA and one for NaN?

Vadim Kotov
F. Privé
  • You'll have to give it some other value but how I'm not sure. [this is the problem](https://stackoverflow.com/questions/10034149/why-is-nan-not-equal-to-nan) – NathanOliver Aug 23 '18 at 12:45
  • See [this](https://stackoverflow.com/questions/1565164/what-is-the-rationale-for-all-comparisons-returning-false-for-ieee754-nan-values) for some insight... – Dan Mašek Aug 23 '18 at 12:46
  • `NaN` is not equal to itself in c++. Since keys match when they compare equal, you can never match a key whose value is `NaN`. – François Andrieux Aug 23 '18 at 12:49
  • It is not a good idea to use double as key of map anyway besides this problem – Slava Aug 23 '18 at 12:50
  • I find the concept of using a `double` as a key a bit strange. Not just because it breaks down for `NaN` values, but also because it may behave unexpectedly if you need to calculate a key, due to floating point rounding error. – François Andrieux Aug 23 '18 at 12:50
  • @FrançoisAndrieux Using double as key ain't that bad in regular map. But true, in hash map comparing for equality is quite used and double is not the best for this one. – bartop Aug 23 '18 at 12:54
  • Florian you are mixing _R extensions_ with standard IEEE behaviour. R has NaN, NA and NULL, C++ only has NaN. I think this is a thinko on your part. You need to map these differently. – Dirk Eddelbuettel Aug 23 '18 at 13:04

2 Answers

6

bartop's idea of using a custom comparator is good, although the particular form did not work for me, so I used Boost's documentation as a starting point. Combined with suitable functions from R, I get:

// [[Rcpp::depends(BH)]]
#include <boost/unordered_map.hpp>
#include <Rcpp.h>
using namespace Rcpp;

// Key equality for R doubles: NA matches NA, NaN matches NaN,
// and every other value is compared with ==.
struct R_equal_to {
  bool operator()(double x, double y) const {
    return (R_IsNA(x) && R_IsNA(y)) ||
      (R_IsNaN(x) && R_IsNaN(y)) ||
      (x == y);
  }
};

// [[Rcpp::export]]
void test_unordered_map(NumericVector vec) {

  boost::unordered_map<double, int, boost::hash<double>, R_equal_to> mymap;  
  int n = vec.size();
  for (int i = 0; i < n; i++) {
    mymap.insert(std::make_pair(vec[i], i));
  }

  boost::unordered_map<double, int, boost::hash<double>, R_equal_to>::iterator it = mymap.begin(), end = mymap.end();
  while (it != end) {
    Rcout << it->first << "\t";
    it++;
  }
  Rcout << std::endl;
}

/*** R
x <- c(sample(10, 100, TRUE), rep(NA, 5), NaN) + 0
test_unordered_map(x)
*/

Result:

> x <- c(sample(10, 100, TRUE), rep(NA, 5), NaN) + 0

> test_unordered_map(x)
7   2   nan nan 4   6   9   5   10  8   1   3   

As desired, NA and NaN are inserted only once. However, one cannot differentiate between them in this output, since R's NA is just a special form of an IEEE NaN.
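To illustrate why both keys print as nan, here is a minimal standalone sketch that labels them separately. It assumes that R encodes NA_real_ as a NaN whose low 32 bits hold the value 1954; the helpers make_r_style_na, is_r_style_na, label_key and demo_labels are hypothetical names for illustration and are not part of R or Rcpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <string>

// Assumption: R's NA_real_ is a NaN whose low 32 bits hold 1954.
// Hypothetical helper that builds such a value without going through R.
inline double make_r_style_na() {
    const std::uint64_t bits = 0x7FF0000000000000ULL | 1954u;
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Hypothetical stand-in for R_IsNA: a NaN carrying the 1954 payload.
inline bool is_r_style_na(double x) {
    if (!std::isnan(x)) return false;
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return static_cast<std::uint32_t>(bits & 0xFFFFFFFFu) == 1954u;
}

// Returns "NA" or "NaN" for the two special keys, the number otherwise.
inline std::string label_key(double x) {
    if (is_r_style_na(x)) return "NA";
    if (std::isnan(x)) return "NaN";
    return std::to_string(x);
}

// Usage sketch: the two special values get different labels.
inline void demo_labels() {
    assert(label_key(make_r_style_na()) == "NA");
    assert(label_key(std::nan("")) == "NaN");
}
```

Inside Rcpp code the same distinction can presumably be made in the printing loop by branching on R_IsNA(it->first) before falling back to the stream output.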

Ralf Stubner
5

According to the IEEE standard, comparing NaN with == to anything, including itself, always yields false, so the default key equality can never match an existing NaN key. You can provide your own comparator for unordered_map using the std::isnan function.

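// Requires <cmath> for std::isnan and C++14 for the generic lambda.
auto comparator = [](auto val1, auto val2) {
    return (std::isnan(val1) && std::isnan(val2)) || val1 == val2;
};
// The key-equality object is the third constructor argument, after the
// bucket-count hint and the hash function.
boost::unordered_map<double, int, boost::hash<double>, decltype(comparator)> mymap(16, boost::hash<double>(), comparator);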
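As a rough standalone sketch of the same idea, the following uses std::unordered_map rather than Boost so that it compiles without extra dependencies. NanAwareEqual, NanAwareHash, count_distinct_keys and demo_nan_keys are hypothetical names introduced here. The custom hash is an addition to the approach above: keys that compare equal must also hash equal, and NaNs with different bit patterns are not guaranteed to do so under a default hash.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: equality that treats every NaN as equal to every other NaN.
struct NanAwareEqual {
    bool operator()(double a, double b) const {
        return (std::isnan(a) && std::isnan(b)) || a == b;
    }
};

// Hypothetical helper: a hash consistent with NanAwareEqual.
struct NanAwareHash {
    std::size_t operator()(double x) const {
        if (std::isnan(x)) return 0x7ff8u;  // arbitrary constant shared by all NaNs
        if (x == 0.0) return 0;             // +0.0 and -0.0 compare equal
        return std::hash<double>()(x);
    }
};

// Inserts every value as a key and returns the number of distinct keys.
inline std::size_t count_distinct_keys(const std::vector<double>& values) {
    std::unordered_map<double, int, NanAwareHash, NanAwareEqual> seen;
    for (std::size_t i = 0; i < values.size(); ++i) {
        seen.insert(std::make_pair(values[i], static_cast<int>(i)));
    }
    return seen.size();
}

// Usage sketch: two NaN variants and two ordinary values give three keys.
inline void demo_nan_keys() {
    std::vector<double> v = {1.0, std::nan(""), 2.0, -std::nan(""), 1.0};
    assert(count_distinct_keys(v) == 3);
}
```

Note that this version collapses every NaN into one key; separating R's NA from other NaNs would need an additional check in both the equality and the hash.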
bartop