
I'm currently stuck on a question assigned in class. We have a data frame with 3 groups and 3 explanatory variables. I've tried my best to include a bit of the dataset below:

diet <-
    Group     useful   Difficulty   Importance
1   Website   19.6     5.15         9.5
2   Website   15.4     5.75         3.3
3   Nurse     22.3     4.35         5.0
4   Nurse     24.3     7.55         6.0
5   Video     22.5     8.50         18.8
6   Video     14.1     6.30         16.5
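
For reference, the six rows above can be typed in as a reproducible data frame. This is only a sketch: the lowercase column names are an assumption, chosen to match the names used in the code further down, and the real dataset presumably has more rows.

```r
# Sample rows copied from the table above.
# Assumption: lowercase column names (group, useful, difficulty, importance).
diet <- data.frame(
  group      = c("Website", "Website", "Nurse", "Nurse", "Video", "Video"),
  useful     = c(19.6, 15.4, 22.3, 24.3, 22.5, 14.1),
  difficulty = c(5.15, 5.75, 4.35, 7.55, 8.50, 6.30),
  importance = c(9.5, 3.3, 5.0, 6.0, 18.8, 16.5)
)

# Inspect the structure: one character column and three numeric columns
str(diet)
```

Pasting the result of `dput(diet)` into a question is another common way to share data like this.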

Just wondering how you would go about creating boxplots for this set of data? I assume I would use faceting to some extent, but I'm unsure of the rest.

So far this is what I've tried, though it is probably wrong:

ggplot(diet, aes(x = importance, y = useful)) + geom_boxplot() + facet_wrap(~group, scales = "free")

output of graph

BK0082
  • What have you tried? Provide example data and expected output. – zx8754 Mar 10 '20 at 23:29
  • @zx8754 sorry about that! just edited the post with some more info – BK0082 Mar 10 '20 at 23:45
  • Please provide a reproducible example of your dataset (https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). To guide you toward one solution, take a look at reshaping with the `pivot_longer` function (https://tidyr.tidyverse.org/reference/pivot_longer.html) – dc37 Mar 10 '20 at 23:50
  • @dc37 will do! sorry for the trouble, first time on stackoverflow – BK0082 Mar 10 '20 at 23:53
  • No problem ;) Here, people tend not to provide easy solutions when it comes to homework. You learn more if you think about your problem first and then ask for help with a few details or alternative solutions – dc37 Mar 10 '20 at 23:56
  • Your plot is wrong because the x-axis in a boxplot has no numeric significance. It should represent the group only. Consider reshaping the data into a longer format, where you have three variables: one for group (A, B, C), one for the measurement type (usefulness, importance, difficulty) and one for the value. You'll find it easier to graph this. – Edward Mar 11 '20 at 00:01
  • @dc37 haha yeah, I'm definitely still an R newbie, but it's been really fun learning it! Though it would be nice if someone did all my homework, I wouldn't be learning anything at all – BK0082 Mar 11 '20 at 00:01
  • @Edward thank you! will try some pivot_longer stuff on this – BK0082 Mar 11 '20 at 00:10

1 Answer


Your best bet is to tidy (reshape) the data into an appropriate format for further analysis or visualisation. Here, you've got repeated measurements (usefulness, importance, difficulty). So first gather them all into one column:

library(tidyr)
diet2 <- pivot_longer(diet, cols = -group)  # pivot_longer() supersedes gather()

Alternatively,

diet2 <- pivot_longer(diet, cols = c(useful, importance, difficulty))  # names must match the columns in diet

You should get a longer data frame (a "tibble" in tidyverse terms). Have a look at it.
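
To make "have a look at it" concrete, here is a minimal sketch of the reshape on the six sample rows from the question. The lowercase column names are an assumption; with the default arguments, `pivot_longer()` puts the old column names in a column called `name` and the measurements in a column called `value`.

```r
library(tidyr)

# Sample rows from the question; lowercase column names are an assumption
diet <- data.frame(
  group      = c("Website", "Website", "Nurse", "Nurse", "Video", "Video"),
  useful     = c(19.6, 15.4, 22.3, 24.3, 22.5, 14.1),
  difficulty = c(5.15, 5.75, 4.35, 7.55, 8.50, 6.30),
  importance = c(9.5, 3.3, 5.0, 6.0, 18.8, 16.5)
)

# Stack every column except `group` into name/value pairs
diet2 <- pivot_longer(diet, cols = -group)

names(diet2)  # "group" "name" "value"
nrow(diet2)   # 18: 6 original rows x 3 measurement columns
head(diet2)
```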

Then create some boxplots. Given this is homework, I will not provide the solution, but let the OP learn DataCamp style. :)

library(ggplot2)

ggplot(diet2, aes(x = ____, y = ____)) + 
  geom_boxplot() +
  facet_wrap(~____, scales = "free")

Replace ____ with names of the variables.

Edward
  • Everything exactly as you said and got this :) [graph](https://imgur.com/a/3AAh5pc) – BK0082 Mar 11 '20 at 00:22
  • Very good. A+ :) You can also swap the `name` and the `group` variables and see which one suits your objective better. – Edward Mar 11 '20 at 00:26
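
Since the OP has now worked it out, here is one plausible way the template and the swap suggested in the last comment could look. This is a sketch on the six sample rows only, assuming lowercase column names and the default `name` and `value` columns produced by `pivot_longer()`; it is not necessarily the exact code the OP ran.

```r
library(tidyr)
library(ggplot2)

# Sample rows from the question; lowercase column names are an assumption
diet <- data.frame(
  group      = c("Website", "Website", "Nurse", "Nurse", "Video", "Video"),
  useful     = c(19.6, 15.4, 22.3, 24.3, 22.5, 14.1),
  difficulty = c(5.15, 5.75, 4.35, 7.55, 8.50, 6.30),
  importance = c(9.5, 3.3, 5.0, 6.0, 18.8, 16.5)
)
diet2 <- pivot_longer(diet, cols = -group)

# Variant 1: one box per group, one panel per measurement
p_by_measure <- ggplot(diet2, aes(x = group, y = value)) +
  geom_boxplot() +
  facet_wrap(~name, scales = "free")

# Variant 2 (name and group swapped): one box per measurement, one panel per group
p_by_group <- ggplot(diet2, aes(x = name, y = value)) +
  geom_boxplot() +
  facet_wrap(~group, scales = "free")

print(p_by_measure)
print(p_by_group)
```

The first variant compares the groups within each measurement; the second compares the measurements within each group, which is only meaningful if the three measurements share a comparable scale.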