I am looking for the most convenient way of creating boxplots for different values and groups read from a CSV file in R.

First, I read my Sheet into memory:

Sheet <- read.csv("D:/mydata/Table.csv",  sep = ";")

This works fine.

names(Sheet) 

correctly gives me the headers of the different columns.

I can also filter the different groups into separate vectors, like

myData1 <- Sheet[Sheet$Group == 'Group1',]$MyValue
myData2 <- Sheet[Sheet$Group == 'Group2',]$MyValue
...

and draw a boxplot using

boxplot(myData1, myData2, ..., main = "Distribution")

where the ... stands for more vectors I have filled using the selection method above.

However, I have seen that a formula can do the selection and the boxplotting in one go. But when I use something like

boxplot(Sheet~Group, Sheet)

it fails with the following error:

invalid type (list) for variable 'Sheet'

The data in the CSV looks like this:

No;Gender;Type;Volume;Survival
1;m;HCM;150;45
2;m;UCM;202;103
3;f;HCM;192;5
4;m;T4;204;101
...

So I have multiple possible groups and different values, and I'd like to draw one box per group. For example, I could group by gender or by type.

How can I easily draw multiple boxes from my CSV data without having to grab them all manually out of the data?

Thanks for your help.

Regenschein

2 Answers


Try it like this:

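# Simulated stand-in data: two groups of 50 random values each
Sheet <- data.frame(Group = gl(2, 50, labels=c("Group1", "Group2")),
                    MyValue = runif(100))
# Formula interface: value column ~ grouping column
boxplot(MyValue ~ Group, data=Sheet)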
lukeA
  • @fordprefect It just creates a good guess of what your data might look like, which you didn't provide in your post. (It's considered good practice to do so, see here: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example, makes life easier & gets you more responses.) – lukeA Feb 23 '14 at 21:31
  • thanks. I added some example data to my question. I'd like to access some group and some values in a more convenient way, instead of having to get the subsequence of each group 'manually' – Regenschein Feb 23 '14 at 21:40
  • @fordprefect Using the example data, you could e.g. do `boxplot(Volume ~ Gender, data=Sheet)` (volume by 1 factor) or `boxplot(Volume ~ Gender + Type, data=Sheet)` (volume by 2 factors combined) etc. – lukeA Feb 23 '14 at 21:51
  • aah, that's it! I just didn't know that values and group have to be written that way... thank you! – Regenschein Feb 23 '14 at 21:54
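
As a minimal sketch of the formula interface discussed in these comments, the following uses only the four sample rows shown in the question (the real file has more rows, elided there as `...`), so each box is based on very few points. The inline `csv_text` variable and the plot titles are assumptions for illustration; with the actual file, the `read.csv("D:/mydata/Table.csv", sep = ";")` call from the question would take its place.

```r
# Only the four rows shown in the question; the real CSV has more.
csv_text <- "No;Gender;Type;Volume;Survival
1;m;HCM;150;45
2;m;UCM;202;103
3;f;HCM;192;5
4;m;T4;204;101"

# Parse the semicolon-separated text into a data frame.
Sheet <- read.csv(text = csv_text, sep = ";")

# Left of ~ : the numeric column to plot.
# Right of ~: the column that defines the groups.
boxplot(Volume ~ Gender, data = Sheet, main = "Volume by gender")

# Two grouping columns: one box per Gender/Type combination.
boxplot(Volume ~ Gender + Type, data = Sheet,
        main = "Volume by gender and type")
```

The error in the question arises because `Sheet ~ Group` puts the whole data frame on the left-hand side; the formula expects a single column name there, with the data frame passed separately via `data`.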

Using ggplot2:

library(ggplot2)

ggplot(Sheet, aes(x = Group, y = MyValue)) +
  geom_boxplot()

The advantage of using ggplot2 is that you have lots of possibilities for customizing the appearance of your boxplot.
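
As a rough sketch of what that customization might look like with the columns from the question, the example below groups `Volume` by `Type` and colors the boxes by `Gender`. It assumes the ggplot2 package is installed and uses only the four sample rows shown in the question, so the boxes are nearly degenerate; the inline `csv_text` variable, the object name `p`, and the labels are illustrative choices, not part of the original post.

```r
library(ggplot2)

# Only the four rows shown in the question; the real CSV has more.
csv_text <- "No;Gender;Type;Volume;Survival
1;m;HCM;150;45
2;m;UCM;202;103
3;f;HCM;192;5
4;m;T4;204;101"

Sheet <- read.csv(text = csv_text, sep = ";")

# x is the grouping column, y the numeric column;
# fill adds a second grouping shown as box color.
p <- ggplot(Sheet, aes(x = Type, y = Volume, fill = Gender)) +
  geom_boxplot() +
  labs(title = "Volume by type and gender", x = "Type", y = "Volume")

print(p)
```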

Jaap
  • This one works as described, thank you! However, I am also looking for a solution which describes the steps to do something like a "group by" on a Sheet using a usual boxplot... – Regenschein Feb 23 '14 at 21:47
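
One possible way to make the "group by" step explicit with base graphics is `split()`, which cuts a vector into a named list by a grouping column; `boxplot()` accepts such a list directly. This is only a sketch under the assumption that the data looks like the four sample rows in the question; the `csv_text` and `groups` names and the loop over grouping columns are illustrative.

```r
# Only the four rows shown in the question; the real CSV has more.
csv_text <- "No;Gender;Type;Volume;Survival
1;m;HCM;150;45
2;m;UCM;202;103
3;f;HCM;192;5
4;m;T4;204;101"

Sheet <- read.csv(text = csv_text, sep = ";")

# Explicit "group by": a named list with one vector of volumes per gender.
groups <- split(Sheet$Volume, Sheet$Gender)

# boxplot() draws one box per list element, labeled by the element name.
boxplot(groups, main = "Volume by Gender")

# The same step repeated for several grouping columns.
for (grp in c("Gender", "Type")) {
  boxplot(split(Sheet$Volume, Sheet[[grp]]),
          main = paste("Volume by", grp))
}
```

This does by hand what the formula `Volume ~ Gender` does internally, so it is mainly useful when the grouping column is chosen programmatically.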