0

I have to calculate aggregate of 50 variables(categorical) with respect to 1 variable(a) after finding out, I want to create 50 sheets in a single excel workbook where each sheet will look like:- "variable name" a 1 10 2 21 3 18 . .

and so on.

I`m able to create 50 csv and then merge them into 1 workbook but this is long process. What I want is to get a function in which a loop will run , will calculate aggregate and then write in a sheet in the workbook; do this for all 50 times(variables)

package:- WriteXLS was used by me.

Utsav Bhargava
  • 121
  • 2
  • 11

1 Answers1

0

Here's a solution that involves two steps: 1) getting your summary statistics for each variable as separate data frames in a list, and then 2) writing the elements of that list as separate sheets in an Excel workbook. I used iris as the test bed and used xlsx instead of WriteXLS because it makes appending sheets easier.

library(dplyr)
library(lazyeval)
library(xlsx)

# Start with a function to get your summary stats by id
aggregateit <- function(x) {
    require(dplyr)
    require(lazyeval)
    result <- iris %>%  # need to name your initial df here
        group_by(Species) %>%  # need to name your id var here
        summarise_(mean = interp(~mean(var), var = as.name(x)))  # See http://stackoverflow.com/questions/26724124/standard-evaluation-in-dplyr-summarise-on-variable-given-as-a-character-string
    return(result)
}

# Now apply that function to all desired variables --- here, all non-id columns in iris,
# which are columns 1 through 4 --- and then assign the variables names to the elements
# of that list.
agglist <- lapply(names(iris)[1:4], aggregateit)
names(agglist) <- names(iris)[1:4]

# Now write the data frames in that list to a workbook with nice names for the sheets
for (i in 1:length(agglist)) write.xlsx(agglist[[i]], file="filename.xlsx",
    sheetName=names(agglist)[i], append=TRUE)

And here's a version that uses aggregate in the initial function so you can do the whole thing in base R if you like:

aggregateit <- function(variable) {
    var.mean <- aggregate(variable ~ Species, iris, FUN=function(x) mean(x))  # need to swap in your df and id var names
    var.sd <- aggregate(variable ~ Species, iris, FUN=function(x) sd(x)) # ditto
    result <- merge(var.mean, var.sd, by = "Species")  # again for id var
    names(result) <- c("Species", "mean", "sd")  # once more
    return(result)
}

agglist <- lapply(iris[,1:4], aggregateit)

for (i in 1:length(agglist)) write.xlsx(agglist[[i]], file="iris.sumstats.xlsx",
    sheetName=names(agglist)[i], append=TRUE, row.names = FALSE)
ulfelder
  • 5,305
  • 1
  • 22
  • 40