I have a table with the sales volumes of several products, and I want to build a boxplot for each product, with sales volume on the vertical axis and days on the horizontal axis. When I build the plot, boxplots are not drawn for certain values. What is the reason for this? Here is the table:

  Day Cottage.cheese..pcs. Kefir..pcs. Sour.cream..pcs.
1   1                   99         103              111
2   2                   86         101              114
3   3                   92         100              116
4   4                   87         112              120
5   5                   86         104              111
6   6                   88         105              122
7   7                   88         106              118

Here is my code:

head(out1)  # out1 is the table above
boxplot(Day ~ Cottage.cheese..pcs., data = out1)

Here is the result: (image not shown)

sjr_25
  • Please paste a reproducible example: `dput(out1)` – zx8754 Jun 04 '21 at 08:36
  • You have 5 unique values in the cottage cheese column, so you get 5 boxplots. What is the expected output? – zx8754 Jun 04 '21 at 08:37
  • @zx8754 I want to build a boxplot for each product. I didn't know how to do it for all products at once, so I started doing it for each one separately. – sjr_25 Jun 04 '21 at 08:43
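
To make the point about the 5 unique values concrete, here is a minimal sketch. It assumes the same numbers as the table in the question, with shortened column names (`Cottage.cheese` and so on) chosen only for illustration. In a formula `y ~ g`, `boxplot()` draws one box of `y` for each distinct value of `g`, so `Day ~ Cottage.cheese` groups the day numbers by sales value:

```r
# Same numbers as the question's table; the short column names are an assumption.
out1 <- data.frame(
  Day            = 1:7,
  Cottage.cheese = c(99, 86, 92, 87, 86, 88, 88),
  Kefir          = c(103, 101, 100, 112, 104, 105, 106),
  Sour.cream     = c(111, 114, 116, 120, 111, 122, 118)
)

# boxplot(Day ~ Cottage.cheese) summarises exactly these groups:
# the Day values split by each distinct Cottage.cheese value.
groups <- split(out1$Day, out1$Cottage.cheese)
print(groups)

# One box per distinct sales value, most of them built from a single day.
n_boxes <- length(groups)
print(n_boxes)
```

Most groups contain a single day, so their boxes collapse to a flat line, which is probably why the plot looks incomplete.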

2 Answers

Try the following:

# example data
out1 <- read.table(text = " Day Cottage.cheese Kefir Sour.cream
1    1          99        103          111
2    2          86        101          114
3    3          92        100          116
4    4          87        112          120
5    5          86        104          111
6    6          88        105          122
7    7          88        106          118", header = TRUE)

# reshape wide-to-long
outlong <- stats::reshape(out1, idvar = "Day", v.names = "value",
                          timevar = "product", times = colnames(out1)[2:4],
                          varying = colnames(out1)[2:4], direction = "long")

# then plot
boxplot(value~product, outlong)

zx8754
  • Can you please explain what this `stats::reshape` construct is and what it is for? – sjr_25 Jun 04 '21 at 09:01
  • @sjr_25 it is called [reshaping from wide-to-long format](https://stackoverflow.com/q/2185252/680068). There are many other ways of doing it, see the linked post. – zx8754 Jun 04 '21 at 09:03
  • @zx8754 Why do we need reshaping? I couldn't understand that. – sjr_25 Jun 04 '21 at 09:12
  • @sjr_25 does my plot match your expected output? We need reshaping because we need 2 variables to plot: product type on the x axis and sales volume (value) on the y axis. – zx8754 Jun 04 '21 at 09:15
  • @zx8754 Yes, everything matches! Thank you for the explanation. – sjr_25 Jun 04 '21 at 09:17
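
As a rough illustration of the "2 variables" point in the comments, the sketch below builds the long table by hand in base R instead of with `reshape` (a swapped-in approach, not the answer's code) and compares it with passing the wide product columns straight to `boxplot()`, which also accepts a data frame. The data frame repeats the answer's example data; `plot = FALSE` is used only to inspect the computed statistics.

```r
# Example data, same values as in the answer above.
out1 <- data.frame(
  Day            = 1:7,
  Cottage.cheese = c(99, 86, 92, 87, 86, 88, 88),
  Kefir          = c(103, 101, 100, 112, 104, 105, 106),
  Sour.cream     = c(111, 114, 116, 120, 111, 122, 118)
)

# Long format: one row per (Day, product) pair, with a single value column.
outlong <- data.frame(
  Day     = rep(out1$Day, times = 3),
  product = rep(names(out1)[2:4], each = nrow(out1)),
  value   = unlist(out1[2:4], use.names = FALSE)
)

# The formula interface needs the grouping variable (product) and the
# measured variable (value) as two separate columns.
b_long <- boxplot(value ~ product, data = outlong, plot = FALSE)

# Passing the wide product columns directly summarises the same three groups.
b_wide <- boxplot(out1[2:4], plot = FALSE)

print(b_long$names)
print(b_wide$names)
```

Dropping `plot = FALSE` draws the plots. The long format is mainly useful when the grouping has to go through a formula or into ggplot2.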

In addition to the answer above: if you want sales volume on the vertical axis and days on the horizontal axis (using the out1 data provided by zx8754), you can do the following.

library(tidyr)
library(data.table)
library(ggplot2)

#data from wide to long
dt <- pivot_longer(out1, cols = c("Kefir", "Sour.cream", "Cottage.cheese"), names_to = "Product", values_to = "Value")

#set dt to data.table object
setDT(dt)

#convert day from integer to a factor
dt[, Day := as.factor(Day)]

#ggplot
ggplot(dt, aes(x = Day, y = Value)) + geom_bar(stat = "identity") + facet_wrap(~Product)

facet_wrap provides separate graphs for the three products.

I created a bar chart here because boxplots would be useless in this case: every product has only one value for each day.
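
The "one value per day" claim can be checked with a quick count. This is a minimal base R sketch, assuming the same example data and using `stack()` rather than the tidyr/data.table code above:

```r
# Example data, same values as used in the answers.
out1 <- data.frame(
  Day            = 1:7,
  Cottage.cheese = c(99, 86, 92, 87, 86, 88, 88),
  Kefir          = c(103, 101, 100, 112, 104, 105, 106),
  Sour.cream     = c(111, 114, 116, 120, 111, 122, 118)
)

# stack() turns the product columns into long form: 'values' and 'ind' (the product).
long <- stack(out1[2:4])
long$Day <- rep(out1$Day, times = 3)

# Count observations for every Day x product combination.
counts <- table(long$Day, long$ind)
print(counts)

# TRUE when each box would be built from a single observation.
single_obs <- all(counts == 1)
print(single_obs)
```

With a single observation per day and product, each box would degenerate to one horizontal line, so bars show the same information more clearly.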

maarvd