I'm trying to calculate the difference between each pair of adjacent values in a row (column 1 minus column 2, column 3 minus column 4, and so on), then sum those differences per row, for each data frame in a list. I know for/while loops in R absolutely suck. I had this working before, but I've broken it. Can someone suggest how to do this with an alternative from the apply family? Current code:

 for (i in 1:length(refdata)) { #for each dataframe in a list
    refdif <- as.data.frame(matrix(0, ncol = 1, nrow = nrow(refdata[[i]])))
    refdif1 <- c()
    for (z in 1:ncol(refdata[[i]])) { #for each column in a dataframe
        for(x in 1:nrow(refdata[[i]])) { #for each row in a dataframe
            refdif <- (refdata[[i]][x,z] - refdata[[i]][x,z+1]) #difference between this value and the next
            refdif1[x,1] <- (refdif1[x,1] + refidf) #sum of latest difference
        }
    }
    print(refdif1) #where I can conduct tests on each individual dataframe with a column of sums of differences
}
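
For reference, a repaired version of the loop might look like the sketch below. It assumes each element of `refdata` is a data frame with an even number of columns (at least two); the sample data and the names `results`, `d` and `dif` are made up for illustration. The inner loop over rows is dropped because column subtraction in R is already vectorised over rows, and the column loop steps by two so that `z + 1` never runs past the last column.

```r
# Sample data matching the example in the question (assumed to be data frames)
df <- data.frame(var1 = c(1, 5), var2 = c(2, 6), var3 = c(3, 7), var4 = c(4, 8))
refdata <- list(`1` = df, `2` = df)

results <- vector("list", length(refdata))
names(results) <- names(refdata)

for (i in seq_along(refdata)) {
  d <- refdata[[i]]
  dif <- numeric(nrow(d))                    # running sum of differences, one per row
  for (z in seq(1, ncol(d) - 1, by = 2)) {   # first column of each pair: 1, 3, ...
    dif <- dif + (d[[z]] - d[[z + 1]])       # subtraction is vectorised over rows
  }
  results[[i]] <- data.frame(dif = dif)      # one column of summed differences
}
print(results)
```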

Example data (`refdata`, a list of two data frames):

$`1`
     var1 var2 var3 var4 
  1   1     2    3    4
  2   5     6    7    8

$`2`
     var1 var2 var3 var4 
  1   1     2    3    4
  2   5     6    7    8

The difference between var1 and var2 is calculated, then the difference between var3 and var4; the differences are then summed per row and placed in a new data frame with a single column: (1-2) + (3-4), (5-6) + (7-8), and so on:

$`1`
     dif  
  1   -2
  2   -2


$`2`
     dif  
  1   -2
  2   -2
Jaap
JJL

2 Answers

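One way to do it (for a single data frame taken out of the list) is to index the columns with logical vectors, whose values are recycled: `c(T,F)` selects the odd-numbered columns and `c(F,T)` the even-numbered ones. Subtracting one selection from the other gives the difference for each pair of columns, and `rowSums` then sums those differences row-wise.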

# df is a single data frame from the list, e.g. df <- refdata[[1]]
refdata1 <- rowSums(df[c(TRUE, FALSE)] - df[c(FALSE, TRUE)])

Edit

The exact output from the question can be obtained with:

lapply(refdata, function(df){ data.frame(dif=rowSums(df[c(TRUE,FALSE)]-df[c(FALSE,TRUE)])) })

Thanks, Heroka.
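
To make the recycling step easier to follow, here is a minimal sketch that breaks the one-liner into its intermediate pieces. The sample data frame and the names `odd`, `even` and `pair_diffs` are assumptions made for illustration; the approach relies on every data frame having an even number of columns.

```r
# Sample data frame with the same values as the question's example
df <- data.frame(var1 = c(1, 5), var2 = c(2, 6), var3 = c(3, 7), var4 = c(4, 8))

# A length-2 logical index is recycled across all four columns
odd  <- df[c(TRUE, FALSE)]   # TRUE, FALSE, TRUE, FALSE -> var1, var3
even <- df[c(FALSE, TRUE)]   # FALSE, TRUE, FALSE, TRUE -> var2, var4

names(odd)
names(even)

# Data frames are subtracted position by position: var1 - var2, var3 - var4
pair_diffs <- odd - even

# Sum the pairwise differences across each row
dif <- rowSums(pair_diffs)
print(data.frame(dif = dif))
```

Spelling out `TRUE` and `FALSE` instead of `T` and `F` is a small safety measure, since `T` and `F` are ordinary variables that can be reassigned.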

mtoto
  • Are you sure you need colSums? – Heroka Jan 04 '16 at 21:17
  • The exact output could be obtained by `lapply(refdata, function(df){ data.frame(dif=rowSums(df[c(T,F)]-df[c(F,T)])) })` – Heroka Jan 04 '16 at 21:24
  • yes :) you mind if I put it into the answer as an update? – mtoto Jan 04 '16 at 21:27
  • No, that's why I put it in a comment (and deleted my answer, as yours is so much more elegant) – Heroka Jan 04 '16 at 21:32
  • Thanks guys. Works well. – JJL Jan 05 '16 at 16:15
  • @Heroka When working with two lists, is it possible to iterate over both with mapply to compare the data frames at the same position? Each list has the same number of data frames and each corresponds to the other. I am trying this: `mapply(refdata1, refdata2, function(x,y) { x(dif=rowSums(df[c(T,F)]-df[c(F,T)])); y(dif=rowSums(df[c(T,F)]-df[c(F,T)])) })` – JJL Jan 05 '16 at 17:00
  • @JeffLynch please consider making a new question, and not changing your old one/asking a new one in the comments. – Heroka Jan 05 '16 at 17:04
# Create test data
x <- rbind(1:4, 5:8)
refdata <- list(x,x)

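# Calculate results (all elements should have an even number of columns).
# as.matrix() is a no-op for the matrices above and is needed for data frames, since %*% requires a matrix
lapply(refdata, FUN = function(x) as.matrix(x) %*% rep_len(c(1, -1), NCOL(x)))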
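
The matrix product works because multiplying each row by the weights 1, -1, 1, -1 gives var1 - var2 + var3 - var4, which equals (var1 - var2) + (var3 - var4). Below is a sketch of the same idea applied to data frames like those in the question; the helper name `alt_sum` and the sample data are hypothetical. Since `%*%` does not accept a data frame directly, the sketch converts with `as.matrix()` first.

```r
# Sample list of data frames with the same values as the question's example
df <- data.frame(var1 = c(1, 5), var2 = c(2, 6), var3 = c(3, 7), var4 = c(4, 8))
refdata <- list(`1` = df, `2` = df)

# Hypothetical helper: alternating-sign weighted sum of each row
alt_sum <- function(d) {
  m <- as.matrix(d)                    # %*% needs a numeric matrix
  w <- rep_len(c(1, -1), ncol(m))      # weights 1, -1, 1, -1
  data.frame(dif = drop(m %*% w))      # drop() turns the n x 1 result into a vector
}

res <- lapply(refdata, alt_sum)
print(res)
```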
Rick