
I have a data frame (DF) with many columns (colA, colB, colC, colD, ...). I would like to apply the na.approx function, together with group_by, to several, but not all, of its columns. I managed to apply na.approx with group_by to a single column using the following:

DFxna <- DF %>% group_by(colA) %>% mutate(colB = na.approx(colB, na.rm = FALSE, maxgap = 4))

However, I was not able to write code that applies this to several specified columns. I thought lapply would be appropriate and tried it several times, without success.

Stephen
  • It would be easier to help you if you provided [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including a snippet of your data or some fake data. That said: have you tried `across`, e.g. `mutate(across(c(colB, colC), ~ na.approx(.x, na.rm = FALSE, maxgap=4)))`? – stefan Apr 27 '22 at 20:50
  • But that code does not use "group_by", and I need the results grouped by colA. – Stephen Apr 27 '22 at 20:59

1 Answer


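Maybe this fits your need. As I mentioned in my comment, one option is to use dplyr::across. When it is called inside mutate after group_by(colA), the interpolation is carried out separately within each group of colA.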

Using some fake data:

library(zoo)
library(dplyr)

DF <- data.frame(
  colA = c(1, 1, 1, 2, 2, 2, 2),
  colB = c(1, NA, 3, 5, NA, NA, 6),
  colC = c(1, NA, 2, 8, NA, 9, 6),
  colD = c(1, NA, 3, 5, NA, NA, 6)
)

DF %>% 
  group_by(colA) %>% 
  mutate(across(c(colB, colC), ~ na.approx(.x, na.rm = FALSE, maxgap=4)))
#> # A tibble: 7 × 4
#> # Groups:   colA [2]
#>    colA  colB  colC  colD
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1  1      1       1
#> 2     1  2      1.5    NA
#> 3     1  3      2       3
#> 4     2  5      8       5
#> 5     2  5.33   8.5    NA
#> 6     2  5.67   9      NA
#> 7     2  6      6       6
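
If the columns to interpolate are stored in a character vector rather than typed out, a minimal sketch along the same lines could use all_of() inside across. The vector name cols_to_fill is made up for illustration, and the data is the same fake data as above; the final ungroup() is optional and only drops the grouping from the result.

```r
library(zoo)
library(dplyr)

# Same fake data as in the example above
DF <- data.frame(
  colA = c(1, 1, 1, 2, 2, 2, 2),
  colB = c(1, NA, 3, 5, NA, NA, 6),
  colC = c(1, NA, 2, 8, NA, 9, 6),
  colD = c(1, NA, 3, 5, NA, NA, 6)
)

# Hypothetical character vector naming the columns to interpolate
cols_to_fill <- c("colB", "colC")

DFxna <- DF %>%
  group_by(colA) %>%
  # all_of() selects exactly the columns named in the vector;
  # na.approx is applied to each of them within each colA group
  mutate(across(all_of(cols_to_fill),
                ~ na.approx(.x, na.rm = FALSE, maxgap = 4))) %>%
  ungroup()

DFxna
```

Columns not named in cols_to_fill (here colD) are left untouched, so their NA values remain.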
stefan