
Question:
Is there a clean and fast-running way to compute geometric mean using data.table?

Background:
So I am using this:

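library(data.table)
library(zoo)  # provides rollapply()

# Columns to process and the names of the new rolling-mean columns
my_col_list  <- names(mydata)[my_start:ncol(mydata)]
my_name_list <- paste0(my_col_list, "_arithmean")

# Rolling arithmetic mean (window of 5) for every selected column.
# lapply() already returns one vector per column, so no unlist() is needed.
mydata[, (my_name_list) := lapply(.SD, function(x) rollapply(x, 5, mean, na.pad = TRUE)),
       .SDcols = my_col_list]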

But I want to compute a rolling geometric mean. I am working on ~20 million rows and ~1500 columns, so a fast-running built-in is preferred.

I saw these (link) ways of computing geometric mean, but they are hand-coded so they are going to be slower. This (link) is about an overall geometric mean, not a windowed/rolling geometric mean.

Packages that have hand-coded (slower-running) geometric means include:

Not a mean, but fast:

  • Gmedian (function Gmedian): computes a median instead of a mean, but is built to be fast. It uses Rcpp for the computation.

Possibly, but I am not sure:

  • rotations (function mean.SO3)
EngrStudent
  • Change the call to `rollapply(x, 5, geometric.mean, na.pad = TRUE)`. – BENY Jul 20 '17 at 16:07
  • You should post a reproducible example of data along with necessary library() calls so that your code runs. – Frank Jul 20 '17 at 16:09

1 Answer


Just use your own code with the geometric.mean function from the psych package:

library(psych)  # provides geometric.mean()

# Your code with geometric.mean in place of mean
mydata[, (my_name_list) := lapply(.SD, function(x) rollapply(x, 5, geometric.mean, na.pad = TRUE)),
       .SDcols = my_col_list]
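
If rollapply with an R-level function proves too slow at this scale, one alternative worth trying is to use the identity that a geometric mean is the exponential of the arithmetic mean of the logs, and let a compiled rolling mean do the windowing. The sketch below assumes data.table 1.12.0 or later (which provides frollmean) and strictly positive values. The toy table and the "_geomean" suffix are made up for illustration. align = "center" is chosen to mimic rollapply's default alignment.

```r
library(data.table)

# Hypothetical toy data standing in for `mydata`; values must be > 0 for log().
mydata <- data.table(id = 1:8,
                     a  = c(1, 2, 4, 8, 16, 32, 64, 128),
                     b  = rep(3, 8))

my_col_list  <- c("a", "b")
my_name_list <- paste0(my_col_list, "_geomean")  # illustrative naming only

# Rolling geometric mean = exp(rolling arithmetic mean of log(x)).
# frollmean() runs in compiled code; incomplete windows are filled with NA.
mydata[, (my_name_list) := lapply(.SD, function(x)
           exp(frollmean(log(x), n = 5, align = "center"))),
       .SDcols = my_col_list]

print(mydata)
```

I have not benchmarked this on 20 million rows by 1500 columns, so treat any speed-up as something to measure on your own data.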
BENY
  • The psych version is hand-coded. It is "exp(mean(log(x), na.rm=TRUE))". If you type "edit(geometric.mean)" after "psych" is loaded then it shows you the code. This is not actually a built-in/compiled version. – EngrStudent Jul 20 '17 at 16:40
  • @EngrStudent If you want something much quicker, I would use `Rcpp` and write your own function in `C++`. – BENY Jul 20 '17 at 16:46
  • I haven't used Rcpp yet. Can you give me a link to a "starter example" that can get me moving? – EngrStudent Jul 20 '17 at 17:03
  • See this link. It is a different question, but it shows Rcpp performing better: https://stackoverflow.com/questions/45045318/efficient-calculation-of-var-covar-matrix-in-r – BENY Jul 20 '17 at 17:28
  • rollapply has deprecated na.pad. Use "fill = NA" instead. – EngrStudent Oct 23 '19 at 14:37
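
For reference, a minimal sketch of the rollapply call with fill = NA in place of na.pad = TRUE. It assumes the zoo package is installed. The helper geo_mean is a hypothetical stand-in that uses the same exp(mean(log(x))) formula quoted above for psych, so the example does not depend on psych, and the input vector is made up.

```r
library(zoo)

# Hypothetical helper: hand-coded geometric mean, exp(mean(log(x))).
geo_mean <- function(x) exp(mean(log(x)))

# Made-up input; values must be positive for log() to be defined.
x <- c(1, 2, 4, 8, 16, 32, 64)

# Centered window of width 5; positions without a full window become NA.
res <- rollapply(x, width = 5, FUN = geo_mean, fill = NA)
print(res)
```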