Is there a Rcpp sugar for %in%?

For example, I have the following statement in R:

y <- c('XA','XB','XC','XF','XK','XL','XM','XN','XO','XP','XS','XU','XW','XY', 'DF','DS','AS','XL','FG')
x <- ifelse(y %in% c("XA","XB","XC","XF","XK","XL","XM","XN","XO","XP","XS","XU","XW","XY"),"KCA","KUS")

I am trying to use || in Rcpp for the above, where both x and y have been defined to be the type

std::vector<std::string> 

The code snippet is

int n = y.size();
for (int i = 0; i < n; i++){
 if (y[i] == 'XA' ||
      y[i] == 'XB' ||
      y[i] == 'XC' ||
      y[i] == 'XF' ||
      y[i] == 'XK' ||
      y[i] == 'XL' ||
      y[i] == 'XM' ||
      y[i] == 'XN'||
      y[i] == 'XO'||
      y[i] == 'XP' ||
      y[i] == 'XS' ||
      y[i] == 'XU'  ||
      y[i] == 'XW' ||
      y[i] == 'XY' ) {x[i] = 'KCA';}
  else
  {x[i] ='KUS';}
} //end of loop

But I get the following error:

ambiguous overload for operator'=='(operand types are 'std::basic_string<char>' and 'int')

Is there a sugar for

%in%

that I can use in Rcpp, or how do I use || in Rcpp here to avoid the error?

Gompu
  • I think the C++ way to do `%in%` is `std::find()`, see _e.g._ https://stackoverflow.com/questions/571394/how-to-find-out-if-an-item-is-present-in-a-stdvector – neilfws Mar 05 '18 at 21:06
  • Also, welcome to C++ and strings. You used a single `'`, but what you need for strings is `"`. After that your `==` comparison operator code should work. – Dirk Eddelbuettel Mar 05 '18 at 21:13
  • Yeah, it worked after replacing `'` with `"`, but why is that so? Does it mean I cannot use `'` in `std::vector` ? – Gompu Mar 05 '18 at 21:24
  • Language definition going back to C. `'` is used for a single character, which also casts to integer, hence the error you saw. Strings require `"`. – Dirk Eddelbuettel Mar 05 '18 at 21:37
  • To truly replicate `%in%`, one needs to look at the source code. Typing `%in%` (with backticks around it) at the R prompt reveals that `%in%` is itself sugar for `match`. Here is the workhorse function for [match](https://github.com/wch/r-source/blob/e690b0d6998dfbc360f0fa14492eb8648df20949/src/main/unique.c#L880). – Joseph Wood Mar 06 '18 at 02:05
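
To make the points from the comments concrete: in C++ a multi-character literal such as 'XA' has type `int` (with an implementation-defined value), which is why comparing it with a `std::string` fails, while "XA" is a string literal. Once double quotes are used, `std::find()` can stand in for `%in%` on a `std::vector<std::string>`. The following is only a minimal sketch in plain C++11; `is_in` and `classify` are hypothetical helper names, not part of Rcpp or the standard library.

```cpp
#include <algorithm>
#include <cassert>      // assert(), handy for quick checks of the helpers below
#include <string>
#include <type_traits>
#include <vector>

// 'X' is a single char, while "X" is a string literal (array of const char).
static_assert(std::is_same<decltype('X'), char>::value,
              "single quotes give a char");
static_assert(std::is_same<decltype("X"), const char (&)[2]>::value,
              "double quotes give a string literal");

// Hypothetical helper: true when `value` occurs anywhere in `table`.
// This is a linear scan, so each lookup costs O(table.size()).
bool is_in(const std::string& value, const std::vector<std::string>& table) {
    return std::find(table.begin(), table.end(), value) != table.end();
}

// Hypothetical helper mirroring ifelse(y %in% table, yes, no) from R.
std::vector<std::string> classify(const std::vector<std::string>& y,
                                  const std::vector<std::string>& table,
                                  const std::string& yes,
                                  const std::string& no) {
    std::vector<std::string> x;
    x.reserve(y.size());
    for (const std::string& value : y) {
        x.push_back(is_in(value, table) ? yes : no);
    }
    return x;
}
```

Calling `classify(y, table, "KCA", "KUS")` with the vectors from the question would fill `x` in the same way as the R `ifelse()` call, assuming `y` and `table` hold the same strings.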

1 Answer

Check out the Unofficial Rcpp API for an example of the sugar function `in()`, which plays the role of `%in%`. In particular, the example given there is:

CharacterVector A = CharacterVector::create("a", "b", "c", "c", "e", "b", "d");

CharacterVector B = CharacterVector::create("a", "b");

LogicalVector C = in(A, B);

In your case we could construct:

#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::CharacterVector my_classify(Rcpp::CharacterVector x,
                                  Rcpp::CharacterVector table,
                                  std::string true_cond = "KCA",
                                  std::string false_cond = "KUS") {

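  // One output slot per element of x.
  Rcpp::CharacterVector out(x.size());

  // TRUE wherever x[i] occurs in table, like x %in% table in R.
  Rcpp::LogicalVector cond = Rcpp::in(x, table);

  // size() returns R_xlen_t, so use the same type for the index.
  for (R_xlen_t i = 0; i < cond.size(); ++i) {
     if (cond[i]) {
       out[i] = true_cond;
     } else {
       out[i] = false_cond;
     }
  }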

  return out;
}

Test Case

x = c('XA','XB','XC','XF','XK','XL','XM','XN','XO','XP','XS',
      'XU','XW','XY', 'DF','DS','AS','XL','FG')
table = c("XA","XB","XC","XF","XK","XL","XM","XN",
          "XO","XP","XS","XU","XW","XY")
y = my_classify(x, table)
y
#>  [1] "KCA" "KCA" "KCA" "KCA" "KCA" "KCA" "KCA" "KCA" "KCA" "KCA" "KCA"
#> [12] "KCA" "KCA" "KCA" "KUS" "KUS" "KUS" "KCA" "KUS"
coatless
  • Old question but this can also be found in the [official documentation](https://dirk.eddelbuettel.com/code/rcpp/html/namespaceRcpp.html#a5425325b7bba84db55d55ce9c9e3fefd). – Oliver Mar 15 '21 at 12:18
  • The doxygen documentation is lacking in examples and is a pain to navigate. Not sure I would link out there. – coatless Mar 15 '21 at 19:57
  • Definitely agree there. Similarly, I find that the examples (often referenced by Dirk et al.) are a pain to look through and require more reading than should be necessary. I think this might be a limitation of the documentation style enforced by R as it currently stands. But even then I feel it could be better, especially when there are so many fantastic functions that could make one's life much easier (until one moves on to std). – Oliver Mar 15 '21 at 20:54