
I have code that creates a new data frame, df2, which is a copy of an existing data frame, df, with four new columns: a, b, c and d. Each column's values are computed by its own function.

The code below works as intended but it seems repetitive. Is there a more succinct form that you would recommend?

df2 <- df %>% mutate(a = lapply(df[,c("value")], f_a), 
                     b = lapply(df[,c("value")], f_b), 
                     c = lapply(df[,c("value")], f_c), 
                     d = lapply(df[,c("value")], f_d)
)

An example of a cell in the "value" column is "-0.57(-0.88 to -0.26)". I am applying a function to extract the first number:

f_a <- function(x){
    substring(x, 1, regexpr("\\(", x)[1] - 1)
}

This works fine when applied to a single string (it returns -0.57 for the example above). In the data frame, I found that lapply gives the correct value for every cell in the "value" column. The code seems a bit repetitive but works.

sassora
  • Please add the functions, current and expected output. See what makes a great R question [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – NelsonGon Aug 25 '19 at 15:18
  • You are `lapply`ing to just one column. Not an answer to the question but it is likely to be the same as `f_a(df[, "value"])`, etc. As for the question, maybe `?mutate_at`. – Rui Barradas Aug 25 '19 at 15:21
  • Thanks for the suggestions, these do make the code shorter. Before using lapply, I had trouble creating a column using functions. When applied to one input value they work perfectly but across all rows it gave wrong values. I reasoned that lapply was applying the function to each cell in the "value" column individually. When I remove lapply, although the code is shorter, R is doing something I don't understand. – sassora Aug 25 '19 at 16:20

1 Answer


We can use `map` from purrr to loop over the functions:

library(tidyverse)
df[c('a', 'b', 'c', 'd')] <- map(list(f_a, f_b, f_c, f_d), ~ lapply(df$value, .x))

Note: Without the other functions or example data, it is not clear whether this is the optimal solution. Also, as noted in the comments, many of the functions can probably be applied directly to the column instead of looping over each element.
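To illustrate the point about applying functions directly to the column, here is a minimal base R sketch. It rests on several assumptions: the toy `df` is made up, `f_a_vec` is a suggested variant of the question's `f_a`, and `f_b_vec` is purely hypothetical because the question does not show `f_b`. A likely reason `f_a` misbehaves on a whole column is the `[1]` after `regexpr()`: it keeps only the first row's match position, so every row gets cut at that same index.

```r
# Toy data; the real df is not shown in the question.
df <- data.frame(
  value = c("-0.57(-0.88 to -0.26)", "1.20(0.95 to 1.45)"),
  stringsAsFactors = FALSE
)

# f_a as posted: [1] keeps only the first element's match position,
# so all rows are cut at the same index when x is a whole column.
f_a <- function(x) {
  substring(x, 1, regexpr("\\(", x)[1] - 1)
}

# Suggested variant: without [1], regexpr() yields one position per element.
f_a_vec <- function(x) {
  substring(x, 1, regexpr("\\(", x) - 1)
}

# Hypothetical stand-in for f_b (not shown in the question):
# returns the text between the parentheses.
f_b_vec <- function(x) {
  sub(".*\\((.*)\\).*", "\\1", x)
}

wrong <- f_a(df$value)   # "-0.57" "1.20("  (second row cut one character late)

# Apply each vectorized function once to the whole column.
df2 <- df
df2[c("a", "b")] <- lapply(list(f_a_vec, f_b_vec), function(f) f(df2$value))
```

If the remaining functions can be vectorized the same way, they can be added to the list and to the column names. Unlike the `map` and `lapply` combination above, this gives ordinary character columns rather than list columns.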

akrun