I'm curious about how to encrypt R functions in a package so that, once the package is built, the functions can be called but the algorithm behind them cannot be found. The algorithm is written in R.

Suppose we create a Function in Rcpp and call it through an exported Rcpp wrapper. The R source is stored as a string, which is parsed into an expression and then evaluated to produce the Function.

#include <Rcpp.h>
using namespace Rcpp;

ExpressionVector secret_expr("function(x) x + 1");
Function secret_fun = secret_expr.eval();

//' Secret function
//' @param x a numeric vector
//' @export
// [[Rcpp::export]]
SEXP SF(SEXP x)
{
  return secret_fun(x);
}

Once the package containing the C++ code above is built into a binary, it seems that we have no way to see the body of secret_fun. However, the function body can still be revealed by running strings package.so | grep function in a shell, where package.so is the shared library of the built package.
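To illustrate why the literal leaks, here is a minimal sketch of roughly what strings does: it scans raw bytes and reports runs of printable characters. The function name printable_runs and the minimum length of 4 are my own choices for illustration, not the real utility's implementation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of at least min_len printable ASCII characters.
// This only approximates the `strings` utility; the name and the
// default threshold are assumptions made for this sketch.
std::vector<std::string> printable_runs(const std::string& bytes,
                                        std::size_t min_len = 4) {
    assert(min_len > 0);
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char c : bytes) {
        if (std::isprint(c)) {
            current.push_back(static_cast<char>(c));
        } else {
            // A non-printable byte ends the current run.
            if (current.size() >= min_len) runs.push_back(current);
            current.clear();
        }
    }
    // Keep a run that extends to the end of the buffer.
    if (current.size() >= min_len) runs.push_back(current);
    return runs;
}
```

Any string literal compiled into the shared library, such as "function(x) x + 1", sits in the file as one such run, so a scan like this finds it without running the code.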

This means that to hide the function body, we cannot write it as a plain string in the C++ code. I found https://stackoverflow.com/a/1360175/2906900 quite interesting, so I tested it with the example above:

#include "HideString.h"
#include <Rcpp.h>
using namespace Rcpp;

DEFINE_HIDDEN_STRING(Secret, 0x2f, ('f')('u')('n')('c')('t')('i')('o')('n')('(')('x')(')')('x')('+')('1'))

ExpressionVector secret_expr(GetSecret());
Function secret_fun = secret_expr.eval();

//' Secret function
//' @param x a numeric vector
//' @export
// [[Rcpp::export]]
SEXP SF(SEXP x)
{
  return secret_fun(x);
}

This time I cannot find the function source via strings, and the function works exactly the same. It therefore seems possible to encrypt sensitive algorithms this way. But is there an easy way to crack this?
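As a rough sketch of how weak this kind of hiding could be, suppose the macro does nothing more than XOR each character with a single-byte key. That is an assumption for illustration; the actual HideString.h may use a different scheme. Under that assumption, the helper names xor_with_key and brute_force below are hypothetical, and trying all 256 keys while looking for a likely keyword recovers the source:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a hidden-string scheme: XOR every character
// with one single-byte key. Applying it twice with the same key restores
// the original text.
std::string xor_with_key(const std::string& text, unsigned char key) {
    std::string out = text;
    for (char& c : out) {
        c = static_cast<char>(static_cast<unsigned char>(c) ^ key);
    }
    return out;
}

// Try all 256 keys and return the first decoding that contains `needle`.
// For R code, "function" is an obvious guess. Returns an empty string
// if no key produces a match.
std::string brute_force(const std::string& hidden, const std::string& needle) {
    assert(!needle.empty());
    for (int key = 0; key < 256; ++key) {
        std::string candidate =
            xor_with_key(hidden, static_cast<unsigned char>(key));
        if (candidate.find(needle) != std::string::npos) return candidate;
    }
    return std::string();
}
```

For example, brute_force(xor_with_key("function(x)x+1", 0x2f), "function") gives back the original text. Even with a stronger scheme, GetSecret() still has to hand the plain source to R so it can be parsed, so the decoded string exists in memory at run time.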

Kun Ren
  • I presume you understand what the _Open_ in Open Source stands for? – Dirk Eddelbuettel Sep 07 '17 at 12:54
  • Just curious if it's technically possible. A friend asked and wants to keep proprietary algorithms and models private. – Kun Ren Sep 07 '17 at 13:00
  • Understood. Our day jobs are in the same or similar industries AFAIK. But there are things to do with R, and questions that can be meaningfully asked about here. This one strikes me as being on the other side of the fence. – Dirk Eddelbuettel Sep 07 '17 at 13:01
  • Don't call this encryption, it's obfuscation. Anyone can run your executable under a debugger and trivially extract the source string. The problem with any obfuscation like this is that the R interpreter needs a plain text version of the super sekrit function, so at some point the program has to de-obfuscate it. – Spacedman Sep 07 '17 at 13:40
  • Your friend needs to balance the cost of working out how to keep the algorithms secret against the probability of losing them times the cost of losing them. – Spacedman Sep 07 '17 at 13:42
  • @Spacedman yes that's exactly why I believe it's technically impossible to hide anything with scripting languages. – Kun Ren Sep 07 '17 at 13:44
  • It's technically impossible to hide anything with any language, unless you have serious control over the hardware, and even then hackers will scrape the ceramic from your crypto chip and read the pins to get the decryption keys (as was done on a popular game console). Run secret stuff on a secure server, called from an insecure client via an API. – Spacedman Sep 07 '17 at 13:47
  • Have a look at [this discussion](https://stackoverflow.com/q/25283022/1655567) on a more generic subject of obfuscating R scripts and also [this discussion](https://stackoverflow.com/a/19226817/1655567) on looking at function bodies. – Konrad Sep 07 '17 at 13:49
