R looping through multiple dataframes in a list

Question

I'm trying to calculate the difference between each pair of values in a row (column 1 minus column 2, column 3 minus column 4, and so on), then sum those differences per row, for every dataframe in a list. I know for/while loops in R absolutely suck. I had this working before, but I've broken it. Can someone suggest how to do this with an alternative from the apply family? Current code:

 for (i in 1:length(refdata)) { #for each dataframe in a list
    refdif <- as.data.frame(matrix(0, ncol = 1, nrow = nrow(refdata[[i]])))
    refdif1 <- c()
    for (z in 1:ncol(refdata[[i]])) { #for each column in a dataframe
        for(x in 1:nrow(refdata[[i]])) { #for each row in a dataframe
            refdif <- (refdata[[i]][x,z] - refdata[[i]][x,z+1]) #difference between this value and the next
            refdif1[x,1] <- (refdif1[x,1] + refidf) #sum of latest difference
        }
    }
    print(refdif1) #where I can conduct tests on each individual dataframe with a column of sums of differences
}

Example data (the list refdata; refdata[[1]] is the first dataframe):

$`1`
     var1 var2 var3 var4 
  1   1     2    3    4
  2   5     6    7    8

$`2`
     var1 var2 var3 var4 
  1   1     2    3    4
  2   5     6    7    8

The difference between var1 and var2 is calculated, then the difference between var3 and var4. The differences are summed per row and placed in a single column of a new dataframe: (1-2) + (3-4) for row 1, (5-6) + (7-8) for row 2, and so on:

$`1`
     dif  
  1   -2
  2   -2


$`2`
     dif  
  1   -2
  2   -2

This would strongly benefit from a sample data list. – MichaelChirico Jan 04 '16 at 20:44

score 2 · Accepted Answer · edited May 23 '17 at 11:50

One way to do it (for a single dataframe taken out of the list) is to index with logical vectors, whose values are recycled across the columns: c(T,F) selects the odd-numbered columns and c(F,T) the even-numbered ones. Subtracting the two gives the difference for every pair of columns, and rowSums then sums those differences row-wise.

refdata1 <- rowSums(refdata[[1]][c(TRUE, FALSE)] - refdata[[1]][c(FALSE, TRUE)])
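To see what the recycled logical index actually selects, here is a small sketch that breaks the one-liner into steps. The dataframe is rebuilt to match the question's example, and the names df, odd_cols, even_cols and pair_diffs are illustrative only, not from the original post.

```r
# One dataframe matching the question's example (df is a hypothetical name)
df <- data.frame(var1 = c(1, 5), var2 = c(2, 6), var3 = c(3, 7), var4 = c(4, 8))

# c(TRUE, FALSE) is recycled across the four columns: TRUE FALSE TRUE FALSE
odd_cols  <- df[c(TRUE, FALSE)]   # var1, var3
even_cols <- df[c(FALSE, TRUE)]   # var2, var4

# Element-wise subtraction pairs var1 with var2 and var3 with var4
pair_diffs <- odd_cols - even_cols

# Sum the pairwise differences within each row
dif <- rowSums(pair_diffs)
print(dif)
```

This assumes an even number of columns, so that the two selections have the same shape.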

Edit

Exact output can be obtained by

lapply(refdata, function(df) { data.frame(dif = rowSums(df[c(TRUE, FALSE)] - df[c(FALSE, TRUE)])) })

thx Heroka
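For a self-contained run of the lapply version, the sketch below rebuilds a list shaped like the question's example. How the asker's refdata was actually constructed is not shown in the post, so the construction here is an assumption.

```r
# Rebuild a list like the one in the question (construction is assumed)
df <- data.frame(var1 = c(1, 5), var2 = c(2, 6), var3 = c(3, 7), var4 = c(4, 8))
refdata <- list(`1` = df, `2` = df)

# Apply the pairwise-difference row sum to every dataframe in the list
result <- lapply(refdata, function(d) {
  data.frame(dif = rowSums(d[c(TRUE, FALSE)] - d[c(FALSE, TRUE)]))
})

print(result)
```

Each element of result is a one-column dataframe named dif with -2 in both rows, and the list names 1 and 2 are carried over by lapply.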

answered Jan 04 '16 at 21:13

mtoto

Are you sure you need colSums? – Heroka Jan 04 '16 at 21:17
The exact output could be obtained by `lapply(refdata, function(df){ data.frame(dif=rowSums(df[c(T,F)]-df[c(F,T)])) })` – Heroka Jan 04 '16 at 21:24
yes :) you mind if I put it into the answer as an update? – mtoto Jan 04 '16 at 21:27
No, that's why I put it in a comment (and deleted my answer, as yours is so much more elegant) – Heroka Jan 04 '16 at 21:32
Thanks guys. Works well. – JJL Jan 05 '16 at 16:15
@Heroka When working with two lists is it possible to iterate over both with mapply to do a comparison between dataframes at the same iteration? Each list has the same number of dataframes and each correlates to the other. I am trying this: mapply(refdata1, refdata2, function(x,y) { x(dif=rowSums(df[c(T,F)]-df[c(F,T)])); y(dif=rowSums(df[c(T,F)]-df[c(F,T)])) }) – JJL Jan 05 '16 at 17:00
@JeffLynch please consider making a new question, and not changing your old one/asking a new one in the comments. – Heroka Jan 05 '16 at 17:04

score 1 · Answer 2 · Rick · 2016-01-04T21:49:24.217

# Create test data
x <- rbind(1:4, 5:8)
refdata <- list(x,x)

# Calculate results (all elements should have an even number of columns)
lapply(refdata,  FUN = function(x) x %*% rep_len(c(1, -1), NCOL(x)))
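The matrix product works because multiplying each row by the vector (1, -1, 1, -1) adds the odd columns and subtracts the even ones, which equals the sum of the pairwise differences. The test data in this answer are matrices; if the list holds dataframes, as in the question, %*% needs a conversion first. A minimal sketch, assuming every column is numeric (the names df, refdata_df and signs are illustrative):

```r
# Dataframes shaped like the question's example (names are hypothetical)
df <- data.frame(var1 = c(1, 5), var2 = c(2, 6), var3 = c(3, 7), var4 = c(4, 8))
refdata_df <- list(df, df)

result <- lapply(refdata_df, function(d) {
  signs <- rep_len(c(1, -1), ncol(d))  # 1 -1 1 -1 for four columns
  as.matrix(d) %*% signs               # one-column matrix of row totals
})

print(result)
```

Unlike the rowSums version, each element here is a one-column matrix instead of a dataframe, so wrap it in data.frame() if a dif column is needed.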

edited Jan 04 '16 at 21:49

answered Jan 04 '16 at 21:36

Rick
