
I have measurements of one parameter, formaldehyde, taken over two seasons (summer and winter) with different sensors (A1, B2, B3, ..., F21). The data look like this:

sensor=A1,A1,A2,A2,A3,A3
Formaldehyde=21.3,34.2,55,66.3,70.8,90
season=summer,winter,summer,winter,summer,winter
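
For anyone who wants to reproduce this, the sample above can be written as a small data frame. This is only a sketch: it assumes the lowercase column names sensor and season that the reorder and facet_wrap calls in my code refer to, and it contains just the six values listed here, not the full dataset.

```r
# Sample rows from the listing above as a data frame.
# Column names are assumed to match the plotting code (sensor, season).
DATALL <- data.frame(
  sensor       = c("A1", "A1", "A2", "A2", "A3", "A3"),
  Formaldehyde = c(21.3, 34.2, 55, 66.3, 70.8, 90),
  season       = c("summer", "winter", "summer", "winter", "summer", "winter")
)

str(DATALL)
```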

I am trying to make a geom_boxplot split into two facets (one per season), with the sensors ordered by increasing median independently in each season. I have tried:

fac <- with(DATALL, reorder(sensor,Formaldehyde, median, order = TRUE))
DATALL$sensors <- factor(DATALL$sensor, levels = levels(fac))

a <- ggplot(DATALL,aes(sensors,Formaldehyde, fill=sensors)) + 
  geom_jitter(position=position_jitter(width=0.3, height=0.2), aes(colour=factor(sensors)), alpha=0.4) +
  geom_boxplot(outlier.shape = NA)+
  facet_wrap(~season, scales='free', ncol=1)+
  scale_y_continuous(limits = quantile(DATALL$Formaldehyde, c(0.1, 0.98)))
a<-a+labs(x="",y=expression(Formaldehyde~(30~min)~(µg/ m^{3})))+theme(legend.position = "none")

But this orders the sensors by the median pooled over both seasons, not independently within each facet. Do you have any advice?

Maria
  • Hello Maria, can you submit a dataframe with similar structure as your DATALL please, to help users reproduce your issue? https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Rosalie Bruel Aug 23 '21 at 11:28

2 Answers


I think you could use some code from this post: https://drsimonj.svbtle.com/ordering-categories-within-ggplot2-facets

library(dplyr)    # group_by, summarise, arrange, mutate, select, %>%
library(tidyr)    # unite, separate
library(ggplot2)

DATALL <- data.frame(sensors=c("A1","A1","A2","A2","A3","A3","A1","A1","A2","A2","A3","A3","A1","A1","A2","A2","A3","A3"),
                     Formaldehyde=c(21.3,34.2,55,90,66.3,71.8,22.3,44.2,65,90,69.3,79.8,25.3,35.2,45,70,56.3,80.8),
                     season= c("summer", "winter"))


DATALL <- merge(
  # merge your dataset...
  DATALL %>% 
    # adding a column with season and sensor index
    unite(index, c("season", "sensors")),
  # ... to a column giving the order of sensors by season
  DATALL %>%
    group_by(season, sensors) %>%
    # median per season and sensor, since the question asks for ordering by median
    summarise(Formaldehyde = median(Formaldehyde, na.rm = TRUE)) %>%
    ungroup() %>% arrange(season, Formaldehyde) %>%
    mutate(order = row_number()) %>% arrange(order) %>% 
    # create the common row index
    unite(index, c("season", "sensors")) %>% select(index, order)) %>%
  separate(index, c("season", "sensors"))


ggplot(DATALL,aes(factor(order, levels = min(order):max(order)),Formaldehyde, fill=sensors)) + 
  geom_jitter(position=position_jitter(width=0.3, height=0.2), aes(colour=factor(sensors)), alpha=0.4) +
  geom_boxplot(outlier.shape = NA) +
  facet_wrap(~season, scales='free', ncol=1) +
  scale_x_discrete(
    breaks = DATALL$order,
    labels = DATALL$sensors,
    expand = c(0,0)) +
  labs(x = "sensors")
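
A possible alternative that avoids the merge is sketched below, using only ggplot2 and base R. It assumes the same DATALL columns as above. The helper column key and the "___" separator are names chosen for this illustration and are not part of the original code. The idea is to build one factor level per sensor/season pair, order those levels by median, let each facet drop the levels it does not use, and strip the season suffix from the axis labels.

```r
library(ggplot2)

DATALL <- data.frame(
  sensors      = c("A1","A1","A2","A2","A3","A3","A1","A1","A2","A2","A3","A3","A1","A1","A2","A2","A3","A3"),
  Formaldehyde = c(21.3,34.2,55,90,66.3,71.8,22.3,44.2,65,90,69.3,79.8,25.3,35.2,45,70,56.3,80.8),
  season       = c("summer", "winter")  # recycled to 18 rows
)

# One key per sensor/season pair (hypothetical helper column).
DATALL$key <- paste(DATALL$sensors, DATALL$season, sep = "___")

# Order the keys by their median. Each key belongs to a single season,
# so within a facet the keys appear in order of that season's medians.
DATALL$key <- reorder(DATALL$key, DATALL$Formaldehyde, FUN = median)

p <- ggplot(DATALL, aes(key, Formaldehyde, fill = sensors)) +
  geom_boxplot(outlier.shape = NA) +
  # free x scale so each facet shows only its own keys
  facet_wrap(~season, scales = "free_x", ncol = 1) +
  # show the sensor name only, dropping the "___season" suffix
  scale_x_discrete(labels = function(x) sub("___.*$", "", x)) +
  labs(x = "sensors") +
  theme(legend.position = "none")

p
```

With the example data, the summer panel comes out ordered A1, A2, A3 and the winter panel A1, A3, A2, so the two facets are ordered independently.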


Rosalie Bruel

Thanks a lot, Rosalie. Unfortunately, when I use your code, it does not work as well once more data are added. If I add some extra data:

DATALL <- data.frame(sensors=c("A1","A1","A2","A2","A3","A3","A1","A1","A2","A2","A3","A3","A1","A1","A2","A2","A3","A3"),
                     Formaldehyde=c(21.3,34.2,55,90,66.3,71.8,22.3,44.2,65,90,69.3,79.8,25.3,35.2,45,70,56.3,80.8),
                     season= c("summer", "winter"))

the figure produced by your code looks like this: [figure]

Maria