library(data.table)
library(lubridate)

x1 <- c(20090101, "2009-01-02", "2009 01 03", "2009-1-4",
       "2009-1, 5", "Created on 2009 1 6", "200901 !!! 07")

dt2 <- data.table(id = c(1,1,1,2,2,2,2), date1 = ymd(x1), charval = c("aa","vv","ss","a","b","c","d"))

   id      date1 charval
1:  1 2009-01-01      aa
2:  1 2009-01-02      vv
3:  1 2009-01-03      ss
4:  2 2009-01-04       a
5:  2 2009-01-05       b
6:  2 2009-01-06       c
7:  2 2009-01-07       d

I use the following code to group by id:

dt3 <- dt2[, Map(function(x,y) ifelse(x != "paste", get(x)(y, na.rm = TRUE), paste(y, sep = ";")), 
                              setNames(c("mean", "paste"), names(.SD)), .SD), by = id]

to get something like this:

   id      date1 charval
1:  1 2009-01-02      aa;vv;ss
2:  2 2009-01-05      a;b;c;d

but what I actually get is:

   id date1 charval
1:  1    NA      aa
2:  2    NA       a

1) I don't understand why paste doesn't work. 2) I don't understand why mean(date1) doesn't work, because, for example, the following code works fine:

mean(dt2$date1)
[1] "2009-01-04"
evgenii ershenko

1 Answer

It is not clear why we have to go through Map and get. After grouping by 'id', take the mean of 'date1' and paste the 'charval' values together:

dt2[, .(date1 = mean(date1), charval = toString(charval)), id]
#    id      date1    charval
#1:  1 2009-01-02 aa, vv, ss
#2:  2 2009-01-05 a, b, c, d

Note: toString(x) is equivalent to paste(x, collapse = ", "). To use ";" as the separator, call paste with collapse directly:

dt2[, .(date1 = mean(date1), charval = paste(charval, collapse=";")), id]
#   id      date1  charval
#1:  1 2009-01-02 aa;vv;ss
#2:  2 2009-01-05  a;b;c;d
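The relationship between toString and paste noted above can be checked on a plain character vector. This is a minimal sketch; `vals` is a made-up vector that mirrors the 'charval' values for id 1, and the other variable names are illustrative only.

```r
# Hypothetical sample vector, mirroring the 'charval' values for id 1
vals <- c("aa", "vv", "ss")

# toString() joins the elements with ", "
comma_joined <- toString(vals)

# The same result, spelled out with paste() and collapse
same_via_paste <- paste(vals, collapse = ", ")

# Any other separator needs paste() with an explicit collapse
semicolon_joined <- paste(vals, collapse = ";")

print(identical(comma_joined, same_via_paste))
print(semicolon_joined)
```

Both calls reduce the vector to a single string, which is what a grouped summary needs: one value per group.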

The OP's question, however, is about using Map with get to call mean. That seems to trigger this check in mean.default:

if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
    warning("argument is not numeric or logical: returning NA")
    return(NA_real_)
}

and returns NA because 'date1' is of class Date, even though it is stored as a numeric value. One option is to specify envir in get.
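The check itself can be reproduced outside data.table. The sketch below only illustrates why a Date vector fails the is.numeric test when it reaches mean.default without S3 dispatch; it does not show how get resolves `mean` inside the grouped call. The vector `d` is a made-up example.

```r
# Hypothetical Date vector, mirroring the 'date1' values for id 1
d <- as.Date(c("2009-01-01", "2009-01-02", "2009-01-03"))

# A Date is stored as a double ...
stored_as <- typeof(d)

# ... but is.numeric() deliberately reports FALSE for Date objects
passes_check <- is.numeric(d)

# The generic dispatches to mean.Date and returns a Date
via_generic <- mean(d)

# Calling the default method directly hits the check and returns NA
via_default <- suppressWarnings(mean.default(d))

print(stored_as)
print(passes_check)
print(via_generic)
print(via_default)
```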

Another problem is the use of ifelse. It is better to use if/else here, because the condition is a single value and there are only two branches:

dt2[, Map(function(x, y)  if(x != "paste") get(x, envir = parent.frame())(y, na.rm = TRUE) 
  else paste(y, collapse=':'), setNames(c("mean", "paste"), names(.SD)), .SD), by = id]
#    id      date1  charval
#1:  1 2009-01-02 aa:vv:ss
#2:  2 2009-01-05  a:b:c:d
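The 'charval' result in the question (only "aa" and "a") follows from two base R behaviours that can be checked in isolation: paste with sep does not join the elements of a single vector, and ifelse returns a result the same length as its test. A minimal sketch, with `vals` and `d` as made-up inputs:

```r
# Hypothetical inputs, mirroring one group of the question's data
vals <- c("aa", "vv", "ss")
d <- as.Date("2009-01-02")

# 'sep' separates *arguments*; with one vector nothing is joined
with_sep <- paste(vals, sep = ";")

# 'collapse' joins the elements of the result into one string
with_collapse <- paste(vals, collapse = ";")

# ifelse() returns one element per element of the test, so a length-1
# test keeps only the first value of 'yes'
from_ifelse <- ifelse(TRUE, with_sep, "unused")

# ifelse() also drops attributes, so a Date comes back as a bare number
date_ifelse <- ifelse(TRUE, d, d)

# Plain if/else returns the chosen value unchanged
date_if <- if (TRUE) d else NA

print(with_sep)
print(with_collapse)
print(from_ifelse)
print(class(date_ifelse))
print(class(date_if))
```

This is why the rewritten call uses both if/else and collapse: either fix on its own would still leave one of the two problems in place.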

get is somewhat tricky: it works as expected only if the correct environment is specified. Outside the grouped call, for example:

get("mean")(dt2$date1)
#[1] "2009-01-04"

Alternatively, instead of testing for the "paste" string, we can check the column class: if it is character, paste the values; otherwise return the mean.

dt2[, Map(function(x, y)  if(is.character(y)) get(x)(y, collapse=":") 
     else get(x, envir = parent.frame())(y, na.rm = TRUE),
     setNames(c("mean", "paste"), names(.SD)), .SD), by = id]
#   id      date1  charval
#1:  1 2009-01-02 aa:vv:ss
#2:  2 2009-01-05  a:b:c:d

Note that the first approach (calling mean and paste directly in j) is simpler and avoids these pitfalls, so it is the better choice.

akrun