I'm relatively new to R and want to make boxplots from my data, with Price plotted for each category. My data is in the form:

[Image: data]

For example, I would like a boxplot of price vs. storage, where storage is categorized as 64GB, 256GB, other_GB, and NA_GB. It would also be useful to know how to group those columns into a single category, "storage". After an initial boxplot I could tell that price and the other variables were scaled differently, so I want to know how to make R recognize that a "1" in a column such as 64 GB means one 64 GB item sold at the corresponding price. Thanks for any help.

Jaap
  • Please use `dput` to show the data instead of images, as we can't copy it for testing the code – akrun Mar 21 '18 at 07:12
  • Use the reshape2 library function melt() on your dataframe to convert it from wide to long with id.vars = "Price" to preserve the Price column. This should condense all those columns into many rows and 3 columns, and your life will be so much easier. – xyz123 Mar 21 '18 at 08:12

1 Answer

You haven't provided any sample data, so I'm simulating some data similar to yours to demonstrate.

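set.seed(2017)
# Simulated data: Price plus one 0/1 indicator column per storage size
df <- cbind.data.frame(
    Price = sample(85:200, 20) * 100,
    x64_GB = sample(c(0, 1), 20, replace = TRUE),
    x256_GB = sample(c(0, 1), 20, replace = TRUE),
    other_GB = sample(c(0, 1), 20, replace = TRUE))

library(tidyverse)
df %>%
    gather(key, value, 2:4) %>%   # wide to long: one row per Price and indicator column
    filter(value > 0) %>%         # keep only rows where the indicator is 1
    ggplot(aes(key, Price)) + geom_boxplot()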

Explanation: Reshape your data from wide to long format, drop the rows where value is 0 (the item does not have that storage size), and use ggplot to draw one boxplot of the Price distribution per key.
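The same reshaping can also be sketched with `tidyr::pivot_longer()`, which puts the storage columns into a single column, as asked in the question. This is only an illustration on simulated data: the column names, the new column names `Storage` and `sold`, and the assumption that a 1 marks a sale at that Price are mine, not taken from the real data set.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Simulated indicator data; the column names are assumptions modelled on the question
set.seed(2017)
df <- data.frame(
    Price    = sample(85:200, 20) * 100,
    x64_GB   = sample(c(0, 1), 20, replace = TRUE),
    x256_GB  = sample(c(0, 1), 20, replace = TRUE),
    other_GB = sample(c(0, 1), 20, replace = TRUE)
)

# Collapse every column except Price into one "Storage" column,
# then keep only the rows whose indicator is 1
long <- df %>%
    pivot_longer(-Price, names_to = "Storage", values_to = "sold") %>%
    filter(sold == 1)

# One box per storage category
p <- ggplot(long, aes(Storage, Price)) + geom_boxplot()
p
```

Each remaining row of `long` is one Price paired with one storage category. In the simulated data a row can have a 1 in more than one column, in which case its Price appears once per category; real data with exactly one category per item would not have that duplication.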

If you want to change the order of the boxplots, have a look at this post.
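One common way to control the order, which may or may not be what the linked post describes, is to turn the grouping column into a factor with explicit levels; ggplot draws discrete categories in level order. A minimal sketch with made-up data (the data frame `long` and its values are hypothetical):

```r
library(ggplot2)

# Hypothetical long-format data: one row per sale
long <- data.frame(
    Price = c(9000, 12000, 15000, 11000, 17000, 10000),
    key   = c("x64_GB", "x256_GB", "other_GB", "x64_GB", "x256_GB", "other_GB")
)

# Character columns are plotted in alphabetical order by default;
# explicit factor levels set the left-to-right order of the boxes
long$key <- factor(long$key, levels = c("x64_GB", "x256_GB", "other_GB"))

p <- ggplot(long, aes(key, Price)) + geom_boxplot()
p
```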

Maurits Evers