I'm quite new to R and have been working on this problem for some time, so I could use some help. I have a CSV file (called cleaned_doc) which contains the rating and percent of a product sold per country. I would like to compare 2 countries in a faceted scatterplot, with one facet per country, covering all of the products sold in those particular countries of origin.

When I try doing it like this:

ggplot(data = cleaned_doc, aes(Percent, Rating)) +
  geom_point() +
  labs( y = "Rating", x = "Percent") + 
  facet_grid( ~ Origin ) 

I get faceted scatter plots for all the countries listed. However, I just want to compare 2 (for this example I picked Germany and France). There are more components I want to add later, but those are mainly aesthetics, so I just want to keep it simple for now. Based on what I've seen on Stack Overflow and other places, I tried doing it like this:

countries <- gather(cleaned_doc, key="measure", value="value", c(Origin["Germany"], Origin["France"]))

answer2 <- ggplot(data = cleaned_doc, aes(Percent, Rating)) +
  geom_point() +
  labs( y = "Rating", x = "Percent") + 
  facet_grid(~ countries )

However, I then get the following error:

Error: At least one layer must contain all faceting variables: `countries`.
* Plot is missing `countries`
* Layer 1 is missing `countries`

I'm really not sure what that means or what I'm doing wrong, so I would really appreciate any help.

    You could just filter the data down to your desired countries instead, e.g., `cleaned_doc %>% filter(Origin %in% c("Germany", "France")) %>% ggplot(aes(Percent, Rating)) + ...`. – ulfelder Jan 21 '20 at 19:56
    [See here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on making an R question that folks can help with. That includes a sample of data and all necessary code – camille Jan 21 '20 at 19:59
  • @ulfelder Thank you so much for your help! That solved it :D – EatReadGameRepeat Jan 21 '20 at 20:04

1 Answer

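You simply need to create a data frame that doesn't have the other countries in it, then facet on the existing Origin column. The error appears because facet_grid() looks for the faceting variable among the columns of the plotted data, and countries is a separate object, not a column of cleaned_doc. You can use dplyr's filter() or just do it in base R like this: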

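# Keep only the rows for the two countries of interest
cleaned_doc2 <- cleaned_doc[cleaned_doc$Origin %in% c("Germany", "France"), ]
# Store Origin as plain text rather than a factor
cleaned_doc2$Origin <- as.character(cleaned_doc2$Origin)

# Plot the filtered data frame (cleaned_doc2), not the original
ggplot(data = cleaned_doc2, aes(Percent, Rating)) +
  geom_point() +
  labs(y = "Rating", x = "Percent") +
  facet_grid(~ Origin)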
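
For the filter route, a minimal sketch with dplyr might look like the following. The toy cleaned_doc here is made up purely so the snippet runs on its own (its values and the extra countries are assumptions, not the asker's data), and two_countries is just an illustrative name.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for cleaned_doc; the real file's values will differ
cleaned_doc <- data.frame(
  Origin  = c("Germany", "France", "Italy", "Germany", "France", "Spain"),
  Percent = c(70, 65, 80, 75, 60, 72),
  Rating  = c(3.5, 3.0, 3.75, 3.25, 2.75, 3.5)
)

# Keep only the rows whose Origin is one of the two countries being compared
two_countries <- cleaned_doc %>%
  filter(Origin %in% c("Germany", "France"))

# Origin is a column of the plotted data, so facet_grid() can find it
p <- ggplot(two_countries, aes(Percent, Rating)) +
  geom_point() +
  labs(x = "Percent", y = "Rating") +
  facet_grid(~ Origin)

print(p)
```

Using %in% with a vector of names also makes it easy to add a third country later without chaining more == comparisons.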
Allan Cameron