0

I'm trying to optimize a sorting function in R using C++ through the Rcpp package. The function that I'm trying to mimic is sort(x, decreasing = TRUE, na.last = TRUE). I've been able to implement the first argument decreasing = TRUE just fine with the following code:

library(Rcpp)
x <- c(3, 7, 21, 6, NA, 1, 8, 7)


# base R sorting
sort(x, decreasing = TRUE, na.last = TRUE)
# [1] 21  8  7  7  6  3  1 NA

# Simple Rcpp sorting function
cppFunction("NumericVector sort_cpp(NumericVector x, bool decreasing = false) {
  NumericVector sorted = clone(x).sort(decreasing);
  return sorted;
}")
sort_cpp(x, decreasing = TRUE)
# [1] NA 21  8  7  7  6  3  1

I'm not well versed in C++ or the Rcpp package enough to be able to place NA values last in all cases. When I set decreasing to FALSE, it will place the NA values at the end by default, but when it gets reversed (set to TRUE), it flips the whole vector around and the NA values go at the front of the vector. I've tried a couple of things with limited success (again, very novice in C++), and was able to get something running in a C++ shell but I haven't been able to successfully code it into R. This is mostly copied and pasted from this question:

#include <iostream>
#include <algorithm>
#include <math.h>

using namespace std;

int main()
{
int n, k = 4, j; // k is number of elements
double x = -0.0;
double i = 0;
double swap = 0;//used in the function as a place holder and used for swapping between other variables
double a[100] = { (1/x) + (1/i), 2.3, 1/x *0, 1/i };//array of double elements // 1/i * 0 is NaN
//(1 / i) * 0


for (n = 0; n < (k - 1); n++) // for loop consists of variables and statements in order to arrange contents of array
{
  for (j = 0; j < k - n - 1; j++)
    {
      if (!std::isnan(a[j + 1]) && std::isnan(a[j]) || (a[j] > a[j + 1]))
        {
        swap = a[j];
        a[j] = a[j + 1];
        a[j + 1] = swap;

        }
    }
}

cout << "The list of sorted elements within the array, is: " << endl; /* Output message to user */
for (int i = 0; i < k; i++)// Loop up to number of elements within the array
{
    cout << a[i] << " ";/* Output contents of array */
}
cout << endl; //new line

return 0; 
}

Is there a way to get this into R using the Rcpp package, or equivalent?

  • Cannot tell you anything about Rcpp, unfortunately. But in the c++ version you might replace your loops with a call to the `std::sort` function, that might be supplied with a custom comparison function. That might look like this: ``` std::sort( std::begin(a), std::end(a), [](double d0,double d1) { if( isnan(d0) ) return false; if( isnan(d1) ) return true; return d0 < d1; } ); ``` (Pls note that I changed your array to a std::vector for this. You can see your modified example [here](https://godbolt.org/z/7vzz3bcvq). – kaba Mar 22 '21 at 22:04
  • Thanks for the example! I've tried incorporating it into Rcpp with some luck but I'm still having trouble sorting in decreasing order. Here is what I have: – mcoghill Mar 22 '21 at 23:30
  • `// [[Rcpp::plugins(cpp11)]] #include #include #include #include using namespace Rcpp; using namespace std; // [[Rcpp::export]] NumericVector sort_cpp(NumericVector x, bool decreasing = false) { const int k = x.size(); NumericVector y = clone(x).sort(decreasing); std::sort(std::begin(y), std::end(y), [](double d0, double d1) { if( isnan(d0) ) return false; if( isnan(d1) ) return true; return d0 < d1; }); return y; }` – mcoghill Mar 22 '21 at 23:31
  • Sorry that didn't format properly... – mcoghill Mar 22 '21 at 23:32

1 Answers1

0

I figured it out with help from kaba's suggestion! It just needed a change in the sign used to determine the order of sorting. Here's what I have been able to use in Rcpp to generate an R equivalent sorting algorithm. By the way, I haven't tested it yet to determine performance compared to base R so there could be a better way of achieving the same results. First, save a .cpp file to be sourced:

// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
#include <iostream>
#include <algorithm>
#include <math.h>
using namespace Rcpp;
using namespace std;

// [[Rcpp::export]]
NumericVector sort_cpp(NumericVector x, bool decreasing = false, bool nalast = false) {
  NumericVector y = clone(x);
  if(nalast) {
    if(decreasing) {
      std::sort(std::begin(y), std::end(y),
                [](double d0, double d1) {
                  if( isnan(d0) ) return false;
                  if( isnan(d1) ) return true;
                  return d0 > d1;
                });
    } else {
      std::sort(std::begin(y), std::end(y),
                [](double d0, double d1) {
                  if( isnan(d0) ) return false;
                  if( isnan(d1) ) return true;
                  return d0 < d1;
                });
    }
  } else {
    y.sort(decreasing);
  }
  return y;
}

Then use Rcpp::sourceCpp() to compile the function for use. I haven't incorporated the cases where na.last = FALSE, where NA values would be removed. That was beyond the scope of my use, but this should help someone else get started should this question be stumbled upon in the future!