I have a data frame like this:

2015 | 2016 | 2017 
-----+------+------
1.2  |  5.2 | 5.6
9.1  |  3.7 | 4.3
.../...

and I'd like to box plot the values of the column for each year.

The only way I found is to manually change the data frame (imported from CSV file) into:

Year | Value
-----+-------
2015 | 1.2
...  | ...
2016 | 5.2
...  | ...
2017 | 5.6
...  | ...

and use (if CPV is the name of my data frame):

ggplot(CPV, aes(x=factor(Year), y=Value)) + geom_boxplot()

First question: can I get the same result with the first form of data frame?

Second question: can I get the same plot if I instead have a "Value" column and, for each year, the number of occurrences of that value:

Value | 2015 | 2016 | ...
------+------+------+---
1     |    0 |    5 | ...
1.1   |    4 |    1 | ...

ANSWER

  1. Use the second form of the table for the box plot.

  2. To get to this form, use the melt function from the data.table package.
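A minimal sketch of step 2, assuming the wide table from the question has already been read into a data frame. The name `wide` and its two sample rows are made up for illustration; only `melt` from data.table comes from the text above.

```r
library(data.table)
library(ggplot2)

# Hypothetical stand-in for the CSV import.
# check.names = FALSE keeps "2015", "2016", "2017" as column names
# instead of letting data.frame() rename them to X2015, X2016, X2017.
wide <- data.frame("2015" = c(1.2, 9.1),
                   "2016" = c(5.2, 3.7),
                   "2017" = c(5.6, 4.3),
                   check.names = FALSE)

# Stack every year column into Year/Value pairs (one row per observation).
long <- data.table::melt(data.table::as.data.table(wide),
                         measure.vars = names(wide),
                         variable.name = "Year",
                         value.name = "Value")

# Year is already a factor after melt(), so no factor() call is needed here.
p <- ggplot(long, aes(x = Year, y = Value)) + geom_boxplot()
print(p)
```

The `data.table::` prefix is only there so the call does not depend on which other package providing a `melt` function happens to be loaded.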

gregseth
  • Please include [reproducible data](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – lmo Sep 27 '17 at 11:44
  • The answer to the first question is NO. The second form you have shown is the `melt` form of the data (and that is the right way to do it). For the second question, once you have the occurrences, you can again melt the data and create the boxplot. Do you need the `value` column as well? – Hardik Gupta Sep 27 '17 at 11:46
  • Possible duplicate of the first part: [Plotting two variables as lines using ggplot2 on the same graph](https://stackoverflow.com/questions/3777174/plotting-two-variables-as-lines-using-ggplot2-on-the-same-graph). The second question should be, well, a second question. – Henrik Sep 27 '17 at 11:47
  • @Hardikgupta The `value` column represents the Y-Axis of the plot, but I don't need them otherwise. – gregseth Sep 27 '17 at 11:49
  • @gregseth: you can then melt the dataframe (from library reshape) and use the occurrence data only to create boxplot – Hardik Gupta Sep 27 '17 at 11:50
  • @Hardikgupta I think I got the melt part working, but not sure how to use the occurrence data only to create boxplot. – gregseth Sep 27 '17 at 18:48
  • @lmo Since my questions are about how to use some R functions, any data will do. There's no «error» to reproduce. – gregseth Sep 27 '17 at 18:50
  • @Henrik I'll reformulate my second question then: How, from the input data shown in the last table, can I get the same boxplot as with the code line I provided? Or, in other words, should I transform data frame #3 to make it look like #2, or is there some way to use it as is, with some ggplot parameters? – gregseth Sep 27 '17 at 18:55

1 Answer

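# check.names = FALSE keeps the column names as "2015", "2016", "2017"
# (by default data.frame() would rename them to X2015, X2016, X2017)
df <- data.frame("2015" = c(1.2, 9.2),
                 "2016" = c(5.2, 3.7),
                 "2017" = c(5.6, 4.3),
                 check.names = FALSE)

reshape2 will do the transformation of the data frame:

library(reshape2)
# with no id variables, melt() stacks every column into variable/value pairs
df2 <- melt(df)
# base R renaming, so no extra package is needed for this step
names(df2) <- c("Year", "Value")
df2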

Then you can generate the boxplot as you did before:

library(ggplot2)
ggplot(df2, aes(x = factor(Year), y = Value)) + geom_boxplot()
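
The second question (a `Value` column plus one count column per year) is not covered above. One possible approach, sketched here with made-up counts and the hypothetical names `counts`, `long` and `expanded`, is to melt on `Value` and then repeat each row as many times as its count, which gives back the Year/Value form used for the boxplot:

```r
library(reshape2)
library(ggplot2)

# Hypothetical count table: each cell is how often Value occurred in that year.
counts <- data.frame(Value = c(1, 1.1),
                     "2015" = c(0, 4),
                     "2016" = c(5, 1),
                     check.names = FALSE)

# One row per (Value, Year) pair, with the count in its own column.
long <- reshape2::melt(counts, id.vars = "Value",
                       variable.name = "Year", value.name = "Count")

# Repeat each row Count times so that every observation is a row again;
# rows with a count of 0 simply disappear.
expanded <- long[rep(seq_len(nrow(long)), times = long$Count),
                 c("Year", "Value")]

p <- ggplot(expanded, aes(x = Year, y = Value)) + geom_boxplot()
print(p)
```

This assumes the counts are non-negative whole numbers and small enough that expanding them into individual rows is affordable.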
Samuel Reuther
  • `setnames` is not a base R function. Please list any packages you are using in your answer. – lmo Sep 27 '17 at 12:01