-3

Sorry for the confusing first post. I have edited for clarity and included some sample data.

Clarified summary of problem: I have an Excel spreadsheet has a row for every student registered at the university for each term since fall 2010 with the following information about each student in columns: Term, Campus, College, Major, Gender, Ethnicity, Age.

My goal is to be able to generate a print ready report out of R that will build some output quality tables for printing. I don't care if they are PDF, HTML, etc., as long as I can print them and they are somewhat attractive. So far, I have imported the spreadsheet into R as a CSV I have been attempting to do this with the "GridExtra" library with some success.

I have 3 problems thus far: 1. If the count for a cell in the table is zero, it does not appear in the table; 2. I am unable to understand how to create more complex tables: for example a table that; 3. I am not able to create a column and row total.

An example table is shown below:

       ------Campus S-------|---------Campus M-----|-----Campus O------
       2010    2011    2012   2010    2011    2012  2010    2011    2012   Total column

COE

    A
    B
    C

COBA

    A
    B
    C  

Totals -->

Thus far my efforts have been something like this (small sample dataset):

Term <- c("Fall 2010", "Fall 2010", "Fall 2011", "Fall 2011", "Fall 2011", "Fall 2011", "Fall 2010", 
          "Fall 2010", "Fall 2011", "Fall 2011", "Fall 2011", "Fall 2011")
Campus <- c("S", "M", "O", "O", "S", "S", "O", "S", "S", "O", "S", "S")
College <- c("COE", "COBA", "COBA", "COLFA", "COE", "COBA", "COBA", "COBA", "COBA", "COBA", "COBA", "COLFA")
Major <- c("A", "B", "C", "A", "C", "C", "A", "C", "C", "A", "C", "C")
Gender <- c("M", "F", "F", "F", "F", "M", "F", "F", "M", "F", "F", "M")
Ethnicity <- c("B", "W", "W", "B", "B", "W", "B", "W", "W", "B", "W", "W")
Age <- c(25, 27, 44, 62, 23, 36, 42, 44, 55, 65, 33, 20)
mydata <- data.frame(Term, Campus, College, Major, Gender, Ethnicity, Age)
mydata

termxcamp.table <- table(mydata$Term, mydata$Campus)
termxcoll.table <- table(mydata$Term, mydata$College)

library(gridExtra)
plot.new()
grid.table(termxcamp.table)
plot.new()
grid.table(termxcoll.table)
Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
user3799924
  • 83
  • 2
  • 6
  • 4
    First of all, welcome to the world of R! Your Google drive file is not publicly accessible though. Even better would be if you could provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including an exact specification of what result you want. I'm sure it is an issue that can be solved fairly easily. – thelatemail Jul 03 '14 at 03:10
  • Thanks for pointing out that the google file share was set incorrectly. It should work now. https://drive.google.com/file/d/0B-h2k3sWKyzSYVh2RUhIZ3pwOEU/edit?usp=sharing – user3799924 Jul 03 '14 at 03:15
  • 1
    Please include the data in the question itself, not on an external resource. To be applicable for stackoverflow, a question must be of use for future visitors. Your google drive data may disappear in the future, negating the value of this question. – Matthew Lundberg Jul 03 '14 at 03:20
  • Thanks for the replies and suggestions for making the post better. I believe I have added some clarity as well as some sample data and code that is giving me problems. – user3799924 Jul 03 '14 at 12:25

1 Answers1

1

Welcome to R. I think I understand what you want. First export your excel tables as csv files, then in R:

db<-read.csv("/path/to/file.csv", header=T)

This will import the csv file into R as a dataframe (matrix with multiple data types allowed). Then you want to check out the apply() function, which operates in dataframes/matrices by row or columns. Since your variables are columns, you will want to operate by columns, something like:

means.by.col<-apply(db, 2, mean)

This will create a vector of the same size as your number of columns with the means for each variable. Make sure though you don't try to apply the mean to a column with characters, because it will obviously spit out an error.

xv70
  • 922
  • 1
  • 12
  • 27