
I have a data frame `test` like the one below:

dput(test)

structure(list(Groups = c("Group1", "Group2", "Group3", "Group4", 
"Group5", "Group6", "Group7", "Group8", "Group9", "Group10", 
"Group11", "Group12", "Group13", "Group14", "Group15", "Group16", 
"Group17", "Group18", "Group19", "Group20", "Group21", "Group22"
), Disease = c("Brain", "Brain", "Brain", "Blood", "Blood", "Esophagus", 
"Esophagus", "Esophagus", "Brain", "Brain", "OE", "control", 
"OE", "control", "OE", "control", "PE", "PE_X", "PE_X", "PE", 
"PE_X", "PE"), variable = c("Name", "Name", "Name", "Name", "Name", 
"Name", "Name", "Name", "Name", "Name", "Name", "Name", "Name", 
"Name", "Name", "Name", "Name", "Name", "Name", "Name", "Name", 
"Name"), value = c(1.079825876, 0.236961206, 0.286498286, 0.374978442, 
3.620160544, 1.876875376, 0.293402656, 0.176208121, 0.622282653, 
1.373705338, 9.235592994, 1.437889832, 8.70900915, 1.772903362, 
9.070885831, 1.792899823, 10.29580836, 1.373281466, 4.210242765, 
0, 7.331498976, 14.11415563)), class = "data.frame", row.names = c(NA, 
-22L))

Using the code below, I made a boxplot:

ggplot(data= subset(test, variable == "Name")) + 
  geom_boxplot(aes(x=Disease, y=value, fill=Disease), outlier.shape=NA) +
  geom_jitter(aes(x=Disease, y=value, fill=Disease), position=position_dodge(0.2)) +
  theme_classic(base_size = 12) + xlab("") + ylab("value")


Using the code below, I reordered the x-axis categories (Disease) based on the `value` column.

ggplot(data= test) + 
  geom_boxplot(aes(x=reorder(Disease,value, na.rm=TRUE), y=value, fill=Disease), outlier.shape=NA) +
  geom_jitter(aes(x=Disease, y=value, fill=Disease), position=position_dodge(0.2)) +
  theme_classic(base_size = 12) + xlab("") + ylab("value")


Question: I would like to reorder only the Blood, Brain and Esophagus diseases on the x-axis by value and keep the rest of the diseases on the right side of the plot. How can I do this?

beginner
  • You can always convert the Disease variable to a factor and set the order manually by writing out the levels: `desired_order <- c(the vector of the order you want)`, then `test$Disease <- factor(test$Disease, levels = desired_order)` – Lucca Nielsen Jul 20 '23 at 19:28
  • I don't want to order all the names on the x-axis; I want to order only a few of them based on the `value` column – beginner Jul 20 '23 at 20:23
  • Could you please tell us the exact order the names on your x-axis should have? – Quinten Jul 21 '23 at 07:58

1 Answer


Update after clarification:

library(ggplot2)
library(dplyr)

# Blood, Brain and Esophagus, sorted by their median value
medians <- test %>% 
  filter(Disease %in% c("Esophagus", "Brain", "Blood")) %>%
  group_by(Disease) %>%
  summarize(median = median(value, na.rm = TRUE)) %>%
  arrange(median) %>%
  pull(Disease)

# rest of the levels
rest <- setdiff(unique(test$Disease), medians)

# Combine and reorder
new_order <- c(medians, rest)

# transform disease column to factor with the new levels
test$Disease <- factor(test$Disease, levels = new_order)

ggplot(data = subset(test, variable == "Name")) + 
  geom_boxplot(aes(x = Disease, y = value, fill = Disease), outlier.shape = NA) +
  geom_jitter(aes(x = Disease, y = value, fill = Disease), position = position_dodge(0.2)) +
  theme_classic(base_size = 12) +
  xlab("") +
  ylab("value")
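
As a cross-check on the update above, the same partial ordering can be sketched in base R without dplyr. `partial_levels()` is a hypothetical helper written for this illustration, not a function from any package. One thing worth noting: `reorder()` in the question summarises with `mean` by default, whereas this answer uses the median, and for this data the two give a different order for Brain and Esophagus.

```r
# Rebuild the Disease and value columns from the question's dput() output
test <- data.frame(
  Disease = c("Brain", "Brain", "Brain", "Blood", "Blood", "Esophagus",
              "Esophagus", "Esophagus", "Brain", "Brain", "OE", "control",
              "OE", "control", "OE", "control", "PE", "PE_X", "PE_X", "PE",
              "PE_X", "PE"),
  value = c(1.079825876, 0.236961206, 0.286498286, 0.374978442,
            3.620160544, 1.876875376, 0.293402656, 0.176208121, 0.622282653,
            1.373705338, 9.235592994, 1.437889832, 8.70900915, 1.772903362,
            9.070885831, 1.792899823, 10.29580836, 1.373281466, 4.210242765,
            0, 7.331498976, 14.11415563),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the categories in `first` come first, sorted by a
# summary statistic; all other categories follow in order of appearance.
partial_levels <- function(disease, value, first, fun = median) {
  disease <- as.character(disease)
  keep <- disease %in% first
  stat <- tapply(value[keep], disease[keep], fun, na.rm = TRUE)
  c(names(sort(stat)), setdiff(unique(disease), first))
}

first <- c("Blood", "Brain", "Esophagus")

levels_median <- partial_levels(test$Disease, test$value, first)
levels_mean   <- partial_levels(test$Disease, test$value, first, fun = mean)

print(levels_median)
# "Esophagus" "Brain" "Blood" "OE" "control" "PE" "PE_X"
print(levels_mean)
# "Brain" "Esophagus" "Blood" "OE" "control" "PE" "PE_X"
```

Either vector can then be passed to `factor(test$Disease, levels = ...)` before plotting, depending on whether the median or the mean is the intended ordering criterion.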


First answer: Here is a solution using fct_relevel from the forcats package:

library(forcats)
library(dplyr)
library(ggplot2)

# your order
first_levels <- c("Blood", "Brain", "Esophagus")

# adjusting levels 
test %>% 
  mutate(Disease = fct_relevel(Disease, first_levels)) %>% 
  filter(variable == "Name") %>%
  ggplot() +
  geom_boxplot(aes(x = Disease, y = value, fill = Disease), outlier.shape = NA) +
  geom_jitter(aes(x = Disease, y = value, fill = Disease), position = position_dodge(0.2)) +
  theme_classic(base_size = 12) + 
  xlab("") + 
  ylab("value")


TarJae
  • Sorry, but I want to reorder Blood, Brain and Esophagus by value. In the plot above they don't look like they are ordered by value. – beginner Jul 20 '23 at 20:11
  • Please see my update! – TarJae Jul 20 '23 at 20:27
  • But what if I have multiple variables? The order might change, right? Could you please tell me what to do if I have multiple variables? – beginner Jul 20 '23 at 20:35
  • The categories to be reordered are defined here: `medians <- test %>% filter(Disease %in% c("Esophagus", "Brain", "Blood")) %>%`. We compute the medians of these categories and order them by those medians; the rest stay as they are! – TarJae Jul 20 '23 at 20:37
  • No, what I mean is the column `variable`: I have multiple names in that column, not just `Name`. There are many, like `Name2`, `Name3`, `Name4`, etc., up to 100. – beginner Jul 20 '23 at 20:39
  • Did you get my point? – beginner Jul 20 '23 at 20:52
  • No, sorry. Maybe you could explain? – TarJae Jul 20 '23 at 21:57
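
The follow-up about multiple variables was never resolved in the thread. One possible reading is that the order of Blood, Brain and Esophagus should be recomputed separately for each value of `variable`. Under that assumption, a minimal sketch is to compute one level order per variable and draw one plot per variable. The `toy` data and the helper `order_for()` below are made up for illustration and are not from the question.

```r
# Toy data (invented for illustration): two values of `variable`
toy <- data.frame(
  Disease  = rep(c("Blood", "Brain", "Esophagus", "OE", "control"), times = 2),
  variable = rep(c("Name", "Name2"), each = 5),
  value    = c(3, 1, 2, 9, 1.5,   # Name:  Brain < Esophagus < Blood
               1, 3, 2, 8, 1.2),  # Name2: Blood < Esophagus < Brain
  stringsAsFactors = FALSE
)

first <- c("Blood", "Brain", "Esophagus")

# Hypothetical helper: level order for the rows of a single variable.
# Categories in `first` are sorted by median; the rest keep their order.
order_for <- function(d, first) {
  sub <- d[d$Disease %in% first, ]
  med <- tapply(sub$value, sub$Disease, median, na.rm = TRUE)
  c(names(sort(med)), setdiff(unique(d$Disease), first))
}

# One level order per value of `variable`
orders <- lapply(split(toy, toy$variable), order_for, first = first)

# One plot per variable, each with its own x-axis order (needs ggplot2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plots <- lapply(names(orders), function(v) {
    d <- toy[toy$variable == v, ]
    d$Disease <- factor(d$Disease, levels = orders[[v]])
    ggplot(d, aes(x = Disease, y = value, fill = Disease)) +
      geom_boxplot(outlier.shape = NA) +
      geom_jitter(width = 0.1) +
      theme_classic(base_size = 12) +
      labs(x = "", y = "value", title = v)
  })
}
```

Separate plots are used here because a single factor column can carry only one level order; if all variables must share one plot, a single order would have to be chosen for all of them.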