I have a data frame defined somewhat as follows (there are really 200+ variables):

class_level var1 var2
          1    4    7
          1    6    7
          1    7    6
          4    3    1
          4    1    3

where class_level is either 1 or 4, and var1 and var2 take values from 1 to 7, which are Likert-like response levels from a survey.

I want a data frame that includes counts by level for each var by class level, with a totals row for each variable, something like:

class_level variable Levels Students
          1     var1      1       10  
          1     var1      2        7  
          1     var1      3       28  
          1     var1      4       15  
          1     var1      5       54  
          1     var1      6       38  
          1     var1      7       16
          1     var1  Total      168  
          4     var1      1       58
          .        .      .        .
          .        .      .        .
          .        .      .        .
          4     var1      7       33
          4     var1  Total      294  

I have tried the following:

df.m <- melt( df, id.vars=c("class_level"), na.rm=TRUE )
head(df.m)
#  class_level variable value
#1           4     var1     4
#2           4     var1     6
#3           1     var1     7
#4           4     var1     3
#5           1     var1     5
#6           4     var1     6

df.c <- dcast( df.m, class_level+variable ~ value,
               fun.aggregate=length, 
               subset=.(variable %in% c("var1","var2")),
               margins=TRUE
             )
head(df.c)
#  class_level variable 1 2  3  4  5   6   7 (all)
#1           1     var1 1 1  8 24 56 101  32   223
#2           1     var2 2 4  4 22 49  79  56   216
#3           4     var1 4 5 11 38 91 114  76   339
#4           4     var2 2 6 11 35 73  98 106   331

df.o <- melt( df.c, id.vars=c("class_level","variable"),
              variable.name="Levels", value.name="Students"
            )
head(df.o)
#  class_level variable Levels Students
#1           1     var1      1        1
#2           1     var2      1        2
#3           4     var1      1        4
#4           4     var2      1        2
#5           1     var1      2        1
#6           1     var2      2        4

As you can see, this produces counts by level for each variable, but no totals rows. How do I get the totals rows into the final dataset (df.o)? Any help would be greatly appreciated.

David

dmonder
  • Please provide a [good reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) along with desired outcome. It's not 100% clear to me what you're looking to do right now. – Dason Nov 02 '12 at 19:58
  • Possibly you're just missing a `)` in the `dcast` call? – joran Nov 02 '12 at 20:07
  • @joran - Yes, corrected. – dmonder Nov 02 '12 at 20:35
  • It works for me with `)` included but the total rows are all at the bottom. You just have to sort the data frame. – joran Nov 02 '12 at 20:44
  • @Dason, I added some more detail to my example. Is this helpful? – dmonder Nov 02 '12 at 20:44
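
The sorting suggested in the comments could look something like the following. This is only a sketch on a hand-made stand-in for `df.o` (the counts are invented for illustration), assuming the totals level is labelled `(all)` and sits last among the factor levels of `Levels`, as in the `dcast` output shown in the question.

```r
# Hypothetical stand-in for the melted result, with the totals rows at the bottom
df.o <- data.frame(
  class_level = c(1, 4, 1, 4, 1, 4),
  variable    = "var1",
  Levels      = factor(c("1", "1", "2", "2", "(all)", "(all)"),
                       levels = c("1", "2", "(all)")),
  Students    = c(1, 4, 1, 5, 2, 9)
)

# order() sorts a factor by its level order, so "(all)" lands after the
# numeric levels within each variable / class_level group
df.sorted <- df.o[order(df.o$variable, df.o$class_level, df.o$Levels), ]
df.sorted
```

Because `order()` follows the factor's level order rather than alphabetical order, each group's total ends up directly beneath its own counts.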

1 Answer

I would be inclined to use plyr to help:

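library(reshape2)  # melt
library(plyr)      # ddply
df.m <- melt( df, id.vars=c("class_level"), na.rm=TRUE )
df.m$value <- factor(df.m$value, levels=1:7) # To ensure 0 counts as well
# Tabulate within each group x (not the whole df.m) and append that group's total
df.c <- ddply(df.m, .(class_level, variable),
             function(x) c(table(x$value), Total=length(x$value)))
df.o <- melt(df.c, id.vars=c("class_level", "variable"),
             variable.name="Levels", value.name="Students")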
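
For comparison, the same counts-plus-totals layout can be sketched in base R alone, without plyr or reshape2. This is an illustrative sketch rather than part of the original answer: the toy `df` copies the five example rows from the question, and the loop and the name `pieces` are assumptions made for this example.

```r
# Toy data: the five example rows shown in the question
df <- data.frame(
  class_level = c(1, 1, 1, 4, 4),
  var1 = c(4, 6, 7, 3, 1),
  var2 = c(7, 7, 6, 1, 3)
)

vars <- c("var1", "var2")
pieces <- list()

# One block of rows per variable and class level: levels 1-7, then a Total row
for (v in vars) {
  for (cl in sort(unique(df$class_level))) {
    vals <- df[df$class_level == cl, v]
    vals <- vals[!is.na(vals)]                   # drop missing responses
    counts <- table(factor(vals, levels = 1:7))  # factor keeps zero counts
    pieces[[length(pieces) + 1]] <- data.frame(
      class_level = cl,
      variable    = v,
      Levels      = c(names(counts), "Total"),
      Students    = c(as.vector(counts), length(vals)),
      stringsAsFactors = FALSE
    )
  }
}

df.o <- do.call(rbind, pieces)
rownames(df.o) <- NULL
df.o
```

Building each block with its Total row already attached means no sorting step is needed afterwards, at the cost of an explicit loop over the variables.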
seancarmody