I am new to R and am trying to make a series of plots.

I have a table similar to the following (but much larger):

CITY        LOCATION    NUMBER_OF_PUBS  BEER
Cardiff     Wales       100             Brains
Newport     Wales       50              Brains
Aberystwyth Wales       400             Brains
Edinburgh   Scotland    220             Belhaven
St_Andrews  Scotland    20              Belhaven
Aberdeen    Scotland    800             Belhaven
Bath        England     500             London_Pride
London      England     10              London_Pride
Bristol     England     200             London_Pride
Birmingham  England     100             London_Pride
Dublin      Ireland     250             Guinness
Cork        Ireland     60              Guinness
Galway      Ireland     750             Guinness
Limerick    Ireland     150             Guinness

I am trying to plot a box and whisker plot of location against number of pubs:

[Image: box and whisker plot of pubs]

But I want to plot just a subset of these locations, for example only the rows where the LOCATION column is Wales or Scotland.

I have found this thread: GGPLOT2: how to plot specific selections inthe ggplot() script

and tried this:

t <- as.data.frame(pubs)
ggplot(t, aes(NUMBER_OF_PUBS, LOCATION , fill = factor(BEER))) +
  geom_boxploth(t=subset(t,COHORT="Wales", "Scotland"))

But I seem to be getting nowhere...

I am sure it's very simple, but I just can't seem to work it out. Any help would be greatly appreciated. Thank you.

hdjc90

1 Answer

library(tidyverse)

df <- structure(list(CITY = c("Cardiff", "Newport", "Aberystwyth", 
"Edinburgh", "St_Andrews", "Aberdeen", "Bath", "London", "Bristol", 
"Birmingham", "Dublin", "Cork", "Galway", "Limerick"), LOCATION = c("Wales", 
"Wales", "Wales", "Scotland", "Scotland", "Scotland", "England", 
"England", "England", "England", "Ireland", "Ireland", "Ireland", 
"Ireland"), NUMBER_OF_PUBS = c(100L, 50L, 400L, 220L, 20L, 800L, 
500L, 10L, 200L, 100L, 250L, 60L, 750L, 150L), BEER = c("Brains", 
"Brains", "Brains", "Belhaven", "Belhaven", "Belhaven", "London_Pride", 
"London_Pride", "London_Pride", "London_Pride", "Guinness", "Guinness", 
"Guinness", "Guinness")), class = "data.frame", row.names = c(NA, 
-14L))

df %>%
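  # use filter() to keep only the rows whose LOCATION is one of the wanted values;
  # %in% tests membership, whereas == would recycle the vector and drop matching rows
  filter(LOCATION %in% c("Wales", "Scotland")) %>% 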
  ggplot(aes(NUMBER_OF_PUBS, LOCATION , fill = factor(BEER))) +
  geom_boxplot()
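
One detail worth checking in the filter step is the operator used to match several values. The sketch below uses only base R and a stand-in vector (`location`, `wanted`, `keep_eq` and `keep_in` are illustrative names, not part of the original code) to show how `==` and `%in%` can behave differently on the LOCATION values from the question:

```r
# Stand-in for the LOCATION column, in the same row order as the question's table
location <- rep(c("Wales", "Scotland", "England", "Ireland"),
                times = c(3, 3, 4, 4))
wanted <- c("Wales", "Scotland")

# `==` recycles `wanted` along `location`: odd rows are compared with "Wales",
# even rows with "Scotland", so matching rows can be missed without a warning
keep_eq <- location == wanted

# `%in%` asks, for each row, whether its value appears anywhere in `wanted`
keep_in <- location %in% wanted

sum(keep_eq)  # 4 rows kept
sum(keep_in)  # 6 rows kept (all Wales and Scotland rows)
```

With this row order, `==` keeps only four of the six Wales and Scotland rows, so a boxplot built on it would be drawn from incomplete data.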

Created on 2020-08-20 by the reprex package (v0.3.0)
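
An alternative that stays closer to the attempt in the question is to hand the subset to the layer through its `data` argument. This is a minimal sketch, assuming ggplot2 3.3.0 or later (so that a numeric x and a discrete y produce horizontal boxes); the data frame is rebuilt from the question's table, and `wales_scotland` and `p` are illustrative names:

```r
library(ggplot2)

# The table from the question, rebuilt as a data frame
pubs <- data.frame(
  CITY = c("Cardiff", "Newport", "Aberystwyth", "Edinburgh", "St_Andrews",
           "Aberdeen", "Bath", "London", "Bristol", "Birmingham",
           "Dublin", "Cork", "Galway", "Limerick"),
  LOCATION = rep(c("Wales", "Scotland", "England", "Ireland"),
                 times = c(3, 3, 4, 4)),
  NUMBER_OF_PUBS = c(100, 50, 400, 220, 20, 800, 500, 10, 200, 100,
                     250, 60, 750, 150),
  BEER = rep(c("Brains", "Belhaven", "London_Pride", "Guinness"),
             times = c(3, 3, 4, 4))
)

# Base R subset(): the condition is the second argument and uses %in%
wales_scotland <- subset(pubs, LOCATION %in% c("Wales", "Scotland"))

# The layer's `data` argument overrides the plot-level data for that layer only
p <- ggplot(pubs, aes(NUMBER_OF_PUBS, LOCATION, fill = BEER)) +
  geom_boxplot(data = wales_scotland)
p
```

Filtering before `ggplot()` and overriding `data` in the layer should give the same picture here; the layer-level form is mainly useful when other layers still need the full table.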

Eric