I have data like this:

> dput(x)
structure(list(gene = c("14q", "20q", "18q", "4q", "21p", "21q", 
"5q", "22q", "17p", "3p", "9p", "4p", "9q", "19q", "10q", "15q", 
"16p", "19p", "1p", "18p", "16q", "8p"), CNV = c("Deletion", 
"Amplification", "Deletion", "Deletion", "Deletion", "Deletion", 
"Deletion", "Deletion", "Deletion", "Deletion", "Deletion", "Deletion", 
"Deletion", "Deletion", "Deletion", "Deletion", "Deletion", "Deletion", 
"Deletion", "Deletion", "Deletion", "Deletion"), log10_pvalue = c(1.197226275, 
1.88941029, 5.974694135, 5.73754891, 4.995678626, 4.970616222, 
4.793174124, 4.793174124, 4.109020403, 3.524328812, 3.524328812, 
2.823908741, 2.567030709, 2.186419011, 1.769551079, 1.59345982, 
1.59345982, 1.59345982, 1.416801226, 1.195860568, 1.094743951, 
1.087777943), Percentage_altered = c(3000, 5000, 6100, 5300, 
6100, 5600, 4400, 5000, 5000, 4400, 5000, 4700, 3900, 2800, 3300, 
3100, 3300, 3100, 2200, 3600, 3300, 3300), group = c("Responders", 
"Responders", "Non-responders", "Non-responders", "Non-responders", 
"Non-responders", "Non-responders", "Non-responders", "Non-responders", 
"Non-responders", "Non-responders", "Non-responders", "Non-responders", 
"Non-responders", "Non-responders", "Non-responders", "Non-responders", 
"Non-responders", "Non-responders", "Non-responders", "Non-responders", 
"Non-responders")), row.names = c(NA, -22L), class = "data.frame")

I have used this code

> p=x %>% 
Warning messages:
1: ggrepel: 6 unlabeled data points (too many overlaps). Consider increasing max.overlaps 
2: ggrepel: 6 unlabeled data points (too many overlaps). Consider increasing max.overlaps 
+     mutate(net_frequency=ifelse(CNV == "Deletion", -Percentage_altered/100, Percentage_altered/100),
+            log10_pvalue = if_else(CNV == "Deletion", log10_pvalue, log10_pvalue)) %>% 
+     ggplot(aes(x = log10_pvalue, y = net_frequency, color = log10_pvalue)) +
+     geom_point(aes(size=Percentage_altered)) +
+     geom_text_repel(aes(label=gene), force=15) +
+     geom_hline(yintercept=0, lty=2) +
+     scale_color_distiller(type = "div",palette = 5) +
+     theme_classic() +
+     facet_wrap(~group)
> p+xlab("-log10(qvalue)")+ylab("Net frequency of gain and deletion (%)")+theme(
+     plot.title = element_text(color="black", size=14, face="bold.italic"),
+     axis.title.x = element_text(color="black", size=14, face="bold"),
+     axis.title.y = element_text(color="black", size=14, face="bold")
+ )+theme(axis.text.x = element_text(face="bold", color="black", 
+                                    size=14),
+         axis.text.y = element_text(face="bold", color="black", 
+                                    size=14))

To plot something like this

[image]

I have some problems here

1- As you can see, the chromosomal arm labels (1p, 14q, and so on) are too small to read, and the color makes some of them unreadable. How can I make the text size of these labels bigger?

2- How can I remove the negative sign from the y-axis (60 and 30 instead of -60 and -30)?

3- How can I change the legend titles from Percentage_altered to Frequency altered and from log10_pvalue to -log10(qvalue)?

Thank you so much in advance

stefan
user6517
  • Can you please post "correct" code? `p=x %>%` followed by some warnings is wrong. – r2evans Jan 08 '21 at 18:38
  • What is `if_else(CNV == "Deletion", log10_pvalue, log10_pvalue)` supposed to be doing? – r2evans Jan 08 '21 at 18:41
  • While I like @MarBlo's code for improving the `geom_text_repel`, the three questions you asked (text size, label format, and legend title) are all duplicates. I've suggested an answer that addresses these three literal questions, but will mark it as a duplicate in order to promote the value of the existing answers (a preferred thing in "Stack"). – r2evans Jan 08 '21 at 18:57

2 Answers

  1. geom_text_repel(..., size = 15)

  2. Pass a labelling function that uses abs to scale_y_continuous(labels = ...).

    Note: perhaps you meant to negate all instead of just absolute value? In that case, you can use `-` instead of abs, or you can use scale_y_reverse.

  3. scale_*_continuous(name="quux", ...)

library(dplyr)
library(ggplot2)
library(ggrepel)

p <- x %>%
  mutate(
    net_frequency = if_else(CNV == "Deletion", -Percentage_altered/100, Percentage_altered/100),
    log10_pvalue = if_else(CNV == "Deletion", log10_pvalue, log10_pvalue) # no-op (both branches are identical), kept from the question
  ) %>% 
  ggplot(aes(x = log10_pvalue, y = net_frequency, color = log10_pvalue)) +
  geom_point(aes(size = Percentage_altered)) +
  geom_text_repel(aes(label = gene), size = 15, force = 15) +
  #                                  ^^^^ 1
  geom_hline(yintercept = 0, lty = 2) +
  scale_y_continuous(labels = abs) +
  #                  ^^^^^^ 2
  scale_color_distiller(name = "-log10(qvalue)", type = "div",palette = 5) +
  #                     ^^^^ 3
  scale_size_continuous(name = "Frequency Altered") +
  #                     ^^^^ 3
  theme_classic() +
  facet_wrap(~ group)

p +
  labs(x = "-log10(qvalue)", y = "Net frequency of gain and deletion (%)") +
  theme(
    plot.title = element_text(color="black", size=14, face="bold.italic"),
    axis.title.x = element_text(color="black", size=14, face="bold"),
    axis.title.y = element_text(color="black", size=14, face="bold"),
    axis.text.x = element_text(face="bold", color="black", size=14),
    axis.text.y = element_text(face="bold", color="black", size=14)
  )

[image: ggplot2 adjusted]

(You'll need to find an appropriate size= for your report media, exaggerated here solely for emphasis.)
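
To see why `labels = abs` is enough for point 2: `labels` accepts a function that ggplot2 applies to the break values, so only the tick text changes while the plotted data stay negative. A minimal sketch, assuming made-up breaks and a toy data frame (`breaks`, `toy`, `shown_abs` and `p_abs` are illustration names, not from the question):

```r
library(ggplot2)

# Hypothetical break positions, similar to the y-axis in the question.
breaks <- c(-60, -30, 0, 30, 60)

# What the axis would display with labels = abs: the sign is dropped.
shown_abs <- abs(breaks)
print(shown_abs)

# Toy points (made up for illustration) placed at those y values.
toy <- data.frame(x = 1:5, y = breaks)

# The data are untouched; only the tick labels go through abs().
p_abs <- ggplot(toy, aes(x, y)) +
  geom_point() +
  geom_hline(yintercept = 0, lty = 2) +
  scale_y_continuous(breaks = breaks, labels = abs)
```

Printing `p_abs` should show points below the dashed line with positive tick labels. If the intent were to negate rather than take the absolute value, swapping `abs` for `` `-` `` in the same argument is the variant mentioned in the note above.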

Incidentally, Stack's search engine can be tweaked to give informed results.

  1. Searching `[r] ggplot2 text size` produced "ggplot geom_text font size control"
  2. Searching `[r] ggplot2 axis label format` produced "Custom ggplot2 axis and label formatting"
  3. Searching `[r] ggplot2 legend title` produced "How to change legend title in ggplot"
r2evans
  • Thank you very much @r2evans, it works nicely. As you can see, in the Frequency Altered legend the circles are labelled 3000, 4000, 5000, 6000, while the percentages in my data run from 0 to 100%. How can I change the circle labels to 30, 40, 50, 60? – user6517 Jan 08 '21 at 20:33
  • I think the best way would be to change it before sending it to plot, as in `mutate(Percentage_altered=Percentage_altered/100)`, and then you should change your `ifelse(CNV == "Deletion",...)` later to deal with already-reduced data. I think that's far simpler than messing around with labeling and such. – r2evans Jan 08 '21 at 21:16
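
The rescale-first suggestion in the comment above could look roughly like this. It is a sketch under assumptions: `x_small` is a three-row stand-in built from the first rows of the question's data, and `x_scaled` is an illustration name.

```r
library(dplyr)

# Stand-in for the question's data frame (first three rows only).
x_small <- data.frame(
  gene = c("14q", "20q", "18q"),
  CNV = c("Deletion", "Amplification", "Deletion"),
  Percentage_altered = c(3000, 5000, 6100)
)

x_scaled <- x_small %>%
  mutate(
    # Rescale once, up front, so the size legend shows percentages.
    Percentage_altered = Percentage_altered / 100,
    # mutate() evaluates in order, so this already sees the rescaled column
    # and no longer needs its own division by 100.
    net_frequency = if_else(CNV == "Deletion", -Percentage_altered, Percentage_altered)
  )

print(x_scaled)
```

With the column rescaled before `ggplot()`, `geom_point(aes(size = Percentage_altered))` builds its legend from the reduced values, so no label function is needed for the size scale.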

Here are some answers to your question, although I am not sure why you want to remove the minus sign on the y-axis.

For geom_text_repel you can play with nudge_x/nudge_y and max.overlaps, and you should have a look at the fantastic ggrepel package vignette.

The changes for the scale and the legend title are commented in the code.

Maybe you can respond with the rationale behind taking the minus signs out.


library(tidyverse)
library(ggrepel)

x %>% # x is the data frame from the question's dput()
  mutate(
    net_frequency = ifelse(CNV == "Deletion",
      -Percentage_altered / 100, Percentage_altered / 100
    ),
    log10_pvalue = if_else(CNV == "Deletion", log10_pvalue, log10_pvalue)
  ) %>%
  ggplot(aes(x = log10_pvalue, y = net_frequency, color = log10_pvalue)) +
  geom_point(aes(size = Percentage_altered)) +
  # play with nudge and overlap
  geom_text_repel(aes(label = gene), force = 15, nudge_x = 5, max.overlaps = Inf) +
  geom_hline(yintercept = 0, lty = 2) +
  scale_color_distiller(type = "div", palette = 5) +
  # here you can change the tick labels
  scale_y_continuous(
    breaks = seq(-60, 60, 30),
    labels = c("60", "30", "0", "30", "60")
  ) +
  theme_classic() +
  # here you can change title of legend
  guides(size = guide_legend(title = "Frequency altered")) +
  facet_wrap(~group)
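
As an aside, both legend titles can also be set in one `labs()` call keyed by aesthetic name, instead of `guides()` or the `name =` argument of each scale. A minimal sketch on a toy data frame (`toy`, its columns and `p_labs` are made up for illustration):

```r
library(ggplot2)

# Toy data, made up for illustration; not the question's values.
toy <- data.frame(
  q = c(1.2, 1.9, 6.0),
  net = c(-30, 50, -61),
  pct = c(30, 50, 61)
)

p_labs <- ggplot(toy, aes(x = q, y = net, size = pct, color = q)) +
  geom_point() +
  # One entry per aesthetic: axis titles and legend titles together.
  labs(
    x = "-log10(qvalue)",
    y = "Net frequency of gain and deletion (%)",
    size = "Frequency altered",
    color = "-log10(qvalue)"
  )
```

Which of the three routes to use is mostly a matter of taste. `labs()` keeps all titles in one place, while `name =` keeps each title next to the scale it belongs to.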

MarBlo