I'm new here and could use some help. I have a list of data frames. For each data frame in the list, I want to quickly paste one column onto several other columns in the same data frame, separated only by a period (".").

So if I have one set of data in a list of data frames:

list1[[1]]

A  B  C
2  1  5
4  2  2

Then I want the following result:

list1[[1]]

 A    B   C
2.5  1.5  5
4.2  2.2  2  

Where C is pasted to A and B individually. I then want this operation to take place for each data frame in my list.

I have tried the following:

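pasteX <- function(df) {
  # seq_len(ncol(df) - 1) avoids the precedence trap in 1:dim(df)[2]-1,
  # which evaluates as (1:ncol) - 1 and so starts at column 0
  for (i in seq_len(ncol(df) - 1)) {
    df[, i] <- as.numeric(sprintf("%s.%s", df[, i], df$C))
  }
  return(df)
}
list2 <- lapply(list1, pasteX)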

But this approach is very slow for larger data frames and lists. Any recommendations for making this code faster? Thanks!

zeekster26
    Welcome. Ideally, provide your data in an easier to read way: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – tjebo Jul 12 '18 at 20:18
    Ah, thank you for sharing that...I'll pay more attention to how I present my data and questions from now on! – zeekster26 Jul 12 '18 at 20:29

3 Answers

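Assuming every value is a non-negative integer < 10, pasting the third column on as a decimal digit is the same as adding a tenth of it: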

lapply(list1, function(x){
    x[,-3] <- x[,-3] + x[,3]/10
    x})
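
The arithmetic shortcut above depends on the pasted column holding single digits. A minimal sketch of one way to relax that, assuming all values are non-negative whole numbers: divide by a power of ten that matches the number of digits in each key value. The helper name `paste_decimal` and its `key` argument are hypothetical, not part of the answer.

```r
# Hypothetical helper: append column `key` as decimal digits to every
# other column using arithmetic only.
# Assumption: all values are non-negative whole numbers.
paste_decimal <- function(df, key = "C") {
  others <- setdiff(names(df), key)
  # Count the digits of each key value, e.g. 5 -> 1, 12 -> 2.
  # sprintf("%.0f", ...) avoids scientific notation for large values.
  digits <- nchar(sprintf("%.0f", df[[key]]))
  # The vector is recycled down each column, so row i of every other
  # column gets key[i] / 10^digits[i] added to it.
  df[others] <- df[others] + df[[key]] / 10^digits
  df
}

# The example data from the question
list1 <- list(data.frame(A = c(2, 4), B = c(1, 2), C = c(5, 2)))
list2 <- lapply(list1, paste_decimal)
```

As with `as.numeric()` on the pasted string, trailing zeros in the key are lost: a key of 10 appended to 3 gives 3.1, not "3.10".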
IceCreamToucan

We can use Map on a single data frame, pairing each of the first two columns with the third:

list1[[1]][-3] <- Map(function(x, y) as.numeric(sprintf('%s.%s', x, y)), 
                     list1[[1]][-3], list1[[1]][3])

If there are many datasets, loop over them with lapply. For each one, convert the first two columns to a matrix, paste them with the third column (which is recycled across both columns), assign the result back, and return the dataset:

lapply(list1, function(x)  {
     x[1:2] <- as.numeric(sprintf('%s.%s', as.matrix(x[1:2]), x[,3]));
     x })
#[[1]]
#    A   B C
#1 2.5 1.5 5
#2 4.2 2.2 2

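Or using tidyverse (funs() has been deprecated since dplyr 0.8.0; this version uses across(), which needs dplyr >= 1.0.0)

library(tidyverse)
map(list1, function(df)
      mutate(df, across(1:2, ~ as.numeric(sprintf('%s.%s', .x, C)))))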

Or with data.table

library(data.table)
lapply(list1,  function(x) setDT(x)[, (1:2) := 
     lapply(.SD, function(x) as.numeric(sprintf('%s.%s', x, C))) ,
             .SDcols = 1:2][])
akrun

Try this:

df <- data.frame(a = c(1,2,3), b = c(3,2,1), c = c(2,1,1))


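pastex <- function(x){
 # work on the argument x rather than the global df,
 # so that each list element is actually processed
 m <- sapply(x[,1:2], function(col) as.numeric(paste(col, x$c, sep = '.')))
 m <- as.data.frame(m)
 m <- cbind(m, x["c"])
 return(m)
}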

mylist <- list(df1 = df, df2 = df)

lapply(mylist, pastex)
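
To compare approaches on your own data, here is a rough sketch that checks the string-pasting and arithmetic versions agree and then times each one. The helper names (`make_df`, `paste_version`, `arith_version`) and the list sizes are made up for illustration, and the arithmetic version assumes the pasted column holds single digits (0 to 9). Timings will vary by machine, so none are quoted here.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Hypothetical generator: two integer columns plus a single-digit key column
make_df <- function(n) {
  data.frame(a = sample(0:99, n, replace = TRUE),
             b = sample(0:99, n, replace = TRUE),
             c = sample(0:9,  n, replace = TRUE))
}
biglist <- replicate(20, make_df(1000), simplify = FALSE)

# String-based: paste each column with c, then convert back to numeric
paste_version <- function(x) {
  x[1:2] <- lapply(x[1:2], function(col) as.numeric(paste(col, x$c, sep = ".")))
  x
}

# Arithmetic: valid only because c is a single digit here
arith_version <- function(x) {
  x[1:2] <- x[1:2] + x$c / 10
  x
}

res_paste <- lapply(biglist, paste_version)
res_arith <- lapply(biglist, arith_version)

# Both approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(res_paste, res_arith)))

# Time each approach on this machine
system.time(lapply(biglist, paste_version))
system.time(lapply(biglist, arith_version))
```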