I have a dataset where I analysed SNPs, INDELs and SVs with two different tools, DeepVariant and PanGenie. DeepVariant only assesses SNPs and INDELs, whereas PanGenie is better suited to SVs. So, when I plot a comparison between the two tools against two different graph assemblies, this is what I get: [image]

What I would like instead is a plot where the SNPs, INDELs and SVs facets all stretch the entire length of the figure. I don't mind using nested facets, with DeepVariant and PanGenie on the left side of the plot, each over the variant types it assesses.
I've seen some similar examples but they have the same child facets within the parent one (see here).

Below is the code I used:

library(grid)
library(ggh4x)
library(readxl)
library(scales)
library(ggdark)
library(ggpubr)
library(gtable)
library(ggplot2)
library(forcats)
library(reshape2)
library(ggchicklet) #round column
library(RColorBrewer)

excel <- read_excel("/media/mat/Extreme SSD/TheUniversityOfFerrara/3°Year/pangenie_giraffe-dv_orth_val.xlsx") # load spreadsheet

df <- data.frame(excel) # convert excel to dataframe

df$variant_type <- factor(df$variant_type, levels=c('SNPs', 'INDELs', 'SVs')) # order of variants to be displayed
df$metric <- factor(df$metric, levels=c('recall', 'precision', 'F1')) # order of metrics to be shown

df_2 <- with(df, df[order(variant_type, caller, graph, metric),]) # sort rows by variant type, caller, graph and metric

### Personalized strips
ridiculous_strips <- strip_themed(
  ## Horizontal strips
  background_x = elem_list_rect(fill=c("#f9ab00", "#ffaaaa")),
  text_x = elem_list_text(colour=c("black", "black"), face=c("bold", "bold")),
  by_layer_x = FALSE,
  
  ## Vertical strips
  background_y = elem_list_rect(fill = c("slategray3", "slategray3", "slategray3")),
  by_layer_y = FALSE
)

### PLOT orthogonal validation of variants callers and graphs used
ggplot(df_2, aes(x=metric, y=value, fill=graph)) +
  geom_point(shape=21, alpha=.6, size=3) +
  ggh4x::facet_grid2(variant_type ~ caller, scales='free', switch='y', independent='y', strip=ridiculous_strips) +
  scale_fill_manual(values=rev(brewer.pal(11, "RdBu")[c(1, 11)])) +
  guides(fill=guide_legend(title='assembly', title.position='top', title.hjust=.5, title.theme=element_text(face='italic'))) + ggtitle("variant_calling — effect of alignment tool and graph used") +
  theme_bw() + theme(plot.title=element_text(face='bold.italic', hjust=.5), legend.position='bottom') -> callers_graphs
callers_graphs

Thanks in advance!

Matteo
  • I don't think I understand the desired outcome, despite what I believe must be an accurate description. Would you mind adding a sketch of how you would want it to look? And maybe create a fake data set, or use one of R's many, many data sets, to make this probably interesting question more reproducible? – tjebo Apr 22 '23 at 20:08
  • @tjebo I provided a simple drawing of what I intend to do and also a possible solution. However, as you can see, I'm not able to get both personalised stripes as well as the nested facets around them. Let me know, thanks! – Matteo Apr 23 '23 at 00:30
  • As you're already using ggh4x, you can use `strip_nested()` instead of `strip_themed()`. – teunbrand Apr 23 '23 at 06:38
  • Thanks @teunbrand I did exactly that and switched to `facet_nested_wrap()` to get both SNPs and INDELs under the `DeepVariant` strip! – Matteo Apr 23 '23 at 07:54
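
For readers landing here, the approach described in the last two comments can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact code that was used: the `toy` data frame, the graph names `graph_A`/`graph_B` and the random values are hypothetical placeholders standing in for the spreadsheet, and the strip colours are simply reused from the question. `strip_nested()` takes the same theming arguments as `strip_themed()`, and `facet_nested_wrap()` merges the repeated `DeepVariant` strip across the SNPs and INDELs panels.

```r
library(ggplot2)
library(ggh4x)

# Hypothetical stand-in for the spreadsheet: values are random placeholders,
# not real benchmarking results.
set.seed(1)
toy <- expand.grid(
  metric       = c("recall", "precision", "F1"),
  graph        = c("graph_A", "graph_B"),   # assumed assembly names
  variant_type = c("SNPs", "INDELs", "SVs"),
  stringsAsFactors = FALSE
)
# Each caller only covers its own variant types, as described in the question
toy$caller <- ifelse(toy$variant_type == "SVs", "PanGenie", "DeepVariant")
toy$value  <- runif(nrow(toy), min = 0.7, max = 1)
toy$metric <- factor(toy$metric, levels = c("recall", "precision", "F1"))
toy$variant_type <- factor(toy$variant_type, levels = c("SNPs", "INDELs", "SVs"))

# Themed strips that also nest: with by_layer_x = TRUE the first colour goes
# to the outer (caller) layer and the second to the inner (variant type) layer
nested_strips <- strip_nested(
  background_x = elem_list_rect(fill = c("#f9ab00", "#ffaaaa")),
  text_x       = elem_list_text(colour = c("black", "black"),
                                face = c("bold", "bold")),
  by_layer_x   = TRUE
)

p <- ggplot(toy, aes(x = metric, y = value, fill = graph)) +
  geom_point(shape = 21, alpha = .6, size = 3) +
  # Only caller/variant combinations present in the data get a panel,
  # so there is no empty DeepVariant x SVs or PanGenie x SNPs facet
  facet_nested_wrap(vars(caller, variant_type), nrow = 1,
                    scales = "free_y", strip = nested_strips) +
  theme_bw() +
  theme(legend.position = "bottom")
p
```

Because the wrap layout only draws combinations that exist, every panel uses the full height of the figure; switching `nrow = 1` to `ncol = 1` would stack the panels so each one spans the full width instead.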
