
I need a function that treats every x columns as a separate site. In df1 below there are 8 columns, i.e. 4 sites, each consisting of 2 variables. Previously I have used a procedure like the one below, taken from the answer to "Selecting column sequences and creating variables".

set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8*10, replace=TRUE), ncol=8))

I then need to calculate a column sum so that a total for each variable is obtained.

colsums <- as.data.frame(t(colSums(df1)))

I subsequently split the dataframe using this technique...

lst1 <- setNames(
  lapply(split(1:ncol(colsums), as.numeric(gl(ncol(colsums), 2, ncol(colsums)))),
         function(i) colsums[,i]),
  paste0('site', 1:4))
list2env(lst1, envir=.GlobalEnv)

And organise into one dataframe...

Combined <- as.matrix(mapply(c,site1,site2,site3,site4))
rownames(Combined) <- c("Site.1","Site.2","Site.3","Site.4")

Whilst this technique has worked well on smaller dataframes, it does not scale: with a large number of sites (>500), typing out every site inside the mapply call takes a lot of code, and a site could easily be missed when entering them all manually. Is there an easy way to avoid this after the colsums stage?

James White
  • How are these data frames of "unequal length"? – krlmlr Jul 27 '15 at 11:49
  • If you were interested in just changing the final part, you could use the following: codeString <- paste("as.matrix(mapply(",paste(c("c",names(lst1)),collapse=","),"))",sep=""); Combined <- eval(parse(text=codeString)); rownames(Combined) <- names(lst1) – Wannes Rosiers Jul 27 '15 at 11:52
  • Apologies, my initial question held code on merging the dataframes of unequal length, not knowing if going back a stage (prior to df1) may lead to simpler coding. I decided to leave this part out in the end for simplicity but I forgot to change the question title so thank you for bringing it to my attention. @WannesRosiers this too was a great comment and also worked so thank you! – James White Jul 27 '15 at 12:05
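
As a possible alternative to building the call as a string and evaluating it, the list of sites can be handed to mapply through do.call. This is only a sketch under the question's own setup (8 columns, 2 variables per site); n_sites and groups are illustrative names that do not appear in the original code, and do.call is a swapped-in technique, not the one from the comment above.

```r
set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8 * 10, replace = TRUE), ncol = 8))
colsums <- as.data.frame(t(colSums(df1)))

# Assumption: every site occupies exactly 2 adjacent columns.
n_sites <- ncol(colsums) / 2
groups  <- split(seq_len(ncol(colsums)), rep(seq_len(n_sites), each = 2))
lst1    <- setNames(lapply(groups, function(i) colsums[, i]),
                    paste0("site", seq_len(n_sites)))

# do.call passes every element of lst1 as a separate argument to mapply,
# so no site has to be typed out by hand.
Combined <- as.matrix(do.call(mapply, c(list(FUN = c), unname(lst1))))
rownames(Combined) <- names(lst1)
```

Because the site names are generated from the number of columns, the same lines should carry over to a wider data frame without editing.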

1 Answer


A matrix is a vector with dimensions. Matrices are stored in column-major order in R.

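The call matrix(colsums, nrow=2) should help you a lot: because the values are filled in column by column, each column of the result holds the two totals for one site.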

NB: Polluting the global environment, as list2env(lst1, envir=.GlobalEnv) does, is generally a bad idea.
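
A minimal sketch of how this could look end to end, assuming two variables per site as in the question. It starts from the plain numeric vector returned by colSums(df1) rather than the data frame colsums, and vars_per_site and totals are illustrative names, not part of the original code.

```r
set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8 * 10, replace = TRUE), ncol = 8))

vars_per_site <- 2            # assumption: two variables per site
totals <- colSums(df1)        # named numeric vector, one total per column
stopifnot(length(totals) %% vars_per_site == 0)

# Column-major fill puts one site in each column of the 2-row matrix;
# transposing then gives one row per site.
Combined <- t(matrix(totals, nrow = vars_per_site))
rownames(Combined) <- paste0("Site.", seq_len(nrow(Combined)))
colnames(Combined) <- paste0("V", seq_len(vars_per_site))
```

Nothing here is written to the global environment beyond the result itself, and the number of sites is derived from the data, so 500 sites need no more code than 4.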

krlmlr
  • If I wanted to manipulate the data in a dataframe after this, how can I get it into that format? as.data.frame does not seem to work. – James White Jul 27 '15 at 14:33
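
One way the data-frame follow-up might be handled, sketched under the assumption that the trouble comes from colsums being a one-row data frame rather than a numeric vector (the comment does not show the actual error, so this is a guess). Building the matrix from colSums(df1) directly gives a numeric matrix, which as.data.frame converts without surprises. The names site_df, var1 and var2 are hypothetical.

```r
set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8 * 10, replace = TRUE), ncol = 8))

# Numeric matrix with one row per site (assumes 2 variables per site).
site_totals <- t(matrix(colSums(df1), nrow = 2))

site_df <- as.data.frame(site_totals)
names(site_df) <- c("var1", "var2")                      # hypothetical variable names
site_df$site <- paste0("Site.", seq_len(nrow(site_df)))  # site label as a column
```

Keeping the site label as an ordinary column rather than as row names tends to make later subsetting and merging simpler.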