I have several files that I am looping over with lapply. Within lapply, I want to make one MA plot and one volcano plot for each file. Here is my code:

library(ggplot2)
library(dplyr)

## 'pattern' is a regular expression, not a glob, so match a literal ".txt" suffix
files <- list.files(path = baseDir, pattern = "\\.txt$", full.names = TRUE, recursive = FALSE)

fun <- function(x){
  a <- basename(x)
  a <- sub("\\.txt$", "", a)  # strip only a literal ".txt" extension
  
  df <- read.table(x,header = TRUE,sep = "\t")
  df <- df[,c(1,(ncol(df)-5):(ncol(df)))]
  df <- mutate(df,threshold = ifelse(padj < 0.05,"sig","non-sig"))
  df$sigtype <- paste(df$threshold,df$type,sep="-")
  
  ## Make MAPLOT
  ggplot(df, aes(x = baseMean, y = log2FoldChange)) +
    scale_x_continuous(trans = "log10")+
    geom_point(aes(col = threshold), size = 1, shape = 20)+
    scale_color_manual(values = c("non-sig" = "gray70","sig" = "red")) +
    ylim(-5, 10)+geom_hline(yintercept = 0, linetype = "dashed",color = "black") +
    xlab("mean of normalized counts")+ theme_classic()
  
  ## Make Volcano plot
  ggplot(df, aes(x = log2FoldChange, y = -log10(padj))) +
    scale_x_continuous()+ geom_point()+ xlab("fold change")+ theme_classic()
}

lapply(files,fun)

Running this in RStudio, I expected it to generate one MA plot and one volcano plot for every input file in the character vector files (there are 8 input text files). Instead, it drew only the 8 volcano plots.

What am I missing here?

I also want to write all the volcano plots to one PDF (volcano.pdf), one per page, and all the MA plots to another PDF (maplot.pdf), one per page. How can I achieve this?

user3138373
  • `ggplot` doesn't actually plot anything. `print.ggplot` does. Wrap `ggplot` calls inside of loops, functions, etc. in `print()`. – Mikael Jagan Feb 11 '22 at 06:31
  • `print()` works great. Is it possible to output all the volcano plots to one PDF (one per page) and all the MA plots to another PDF (one per page)? – user3138373 Feb 11 '22 at 06:35
  • Yes, there are tricks - see `?dev.set`. – Mikael Jagan Feb 11 '22 at 06:44
  • I can't find a reproducible example that shows how to output them separately. – user3138373 Feb 11 '22 at 07:09
  • A better structure would be one `lapply` to read the files, another to produce the MA plots, and a third to produce the volcano plots. That inherently helps you produce the separate outputs you want. As general rules, (1) separate ingestion from wrangling from output, and (2) put each task in a separate function. That aids reproducibility and reuse. – Limey Feb 11 '22 at 08:58
  • Can you provide pseudocode for it? – user3138373 Feb 11 '22 at 14:57
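
To illustrate the `print()` point from the comments: a function's value is its last expression, so in the question's `fun` the MA plot object is built and discarded, while the volcano plot object is returned, collected by `lapply`, and drawn only when the resulting list is auto-printed at top level. Below is a minimal sketch of the fix. The toy data frame and the name `two_plots` are hypothetical stand-ins, not part of the original code.

```r
library(ggplot2)

## Hypothetical toy data standing in for one results table
df <- data.frame(baseMean = c(10, 100, 1000),
                 log2FoldChange = c(-1, 0.5, 2),
                 padj = c(0.2, 0.04, 0.001))

two_plots <- function(d) {
  ## Build both plots as objects; nothing is drawn yet
  ma <- ggplot(d, aes(x = baseMean, y = log2FoldChange)) +
    geom_point() +
    scale_x_continuous(trans = "log10")
  volcano <- ggplot(d, aes(x = log2FoldChange, y = -log10(padj))) +
    geom_point()
  ## Draw each plot explicitly on the active graphics device
  print(ma)
  print(volcano)
  ## Return the plot objects invisibly so the caller can reuse them
  invisible(list(ma = ma, volcano = volcano))
}

res <- two_plots(df)
```

When calling such a function through `lapply`, assigning the result (or wrapping the call in `invisible()`) avoids drawing the returned plots a second time through auto-printing.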

1 Answer

Here is a mostly didactic example with dev.set, which I have run in a new R process without any open graphics devices:

## Report active device
dev.cur()
## null device 
##           1

## Open device 2 and make it the active device
pdf("bar.pdf")
## Open device 3 and make it the active device
pdf("foo.pdf")

## List all open devices
dev.list()
## pdf pdf 
##   2   3

f <- function() {
    ## Plot in device 3
    plot(1:10, 1:10, main = "foo")
    ## Cycle to device 2
    dev.set()
    ## Plot in device 2
    plot(rnorm(10L), rnorm(10L), main = "bar")
    ## Cycle to device 3
    dev.set()
    invisible(NULL)
}

## Call 'f' four times
replicate(4L, f()) 

## Close device 3 and report active device
dev.off()
## pdf 
##   2

## Close device 2 and report active device
dev.off()
## null device 
##           1 

## Clean up
unlink(c("foo.pdf", "bar.pdf"))
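
Cycling with a bare `dev.set()` works here only because exactly two devices are open. A variant that does not rely on that is to record each device number with `dev.cur()` right after opening it and then pass the number to `dev.set()` and `dev.off()`. This is a sketch of that idea; the names `dev_bar`, `dev_foo`, `g`, and `made` are introduced here for illustration.

```r
## Open each device and remember its number
pdf("bar.pdf")
dev_bar <- dev.cur()
pdf("foo.pdf")
dev_foo <- dev.cur()

g <- function() {
    ## Select the target device by number before each plot
    dev.set(dev_foo)
    plot(1:10, 1:10, main = "foo")
    dev.set(dev_bar)
    plot(rnorm(10L), rnorm(10L), main = "bar")
    invisible(NULL)
}

## Call 'g' four times: four pages in each file
for (i in 1:4) g()

## Close both devices by number
dev.off(dev_foo)
dev.off(dev_bar)

## Check that both files were written, then clean up
made <- file.exists(c("foo.pdf", "bar.pdf"))
unlink(c("foo.pdf", "bar.pdf"))
```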

@Limey's suggestion is to work in one device at a time to avoid the bookkeeping required by dev.set:

pdf("foo.pdf")
f1 <- function() {
    plot(1:10, 1:10, main = "foo")
    invisible(NULL)
}
replicate(4L, f1())
dev.off()

pdf("bar.pdf")
f2 <- function() {
    plot(rnorm(10L), rnorm(10L), main = "bar")
    invisible(NULL)
}
replicate(4L, f2())
dev.off()

unlink(c("foo.pdf", "bar.pdf"))
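
Applied to the ggplot2 case in the question, the one-device-at-a-time approach might look like the sketch below. It assumes the data frames have already been read into a named list; the simulated `dfs` and the helper names `ma_plot` and `volcano_plot` are hypothetical, and in practice `dfs` would come from something like `lapply(files, read.table, header = TRUE, sep = "\t")`. Each `print()` of a ggplot object on a PDF device starts a new page.

```r
library(ggplot2)

## Hypothetical stand-in for the list of data frames read from the files
set.seed(1)
dfs <- lapply(1:3, function(i) {
  data.frame(baseMean = 10^runif(50, 0, 4),
             log2FoldChange = rnorm(50),
             padj = runif(50))
})
names(dfs) <- paste0("sample", 1:3)

## One function per plot type; each returns a ggplot object without drawing it
ma_plot <- function(d, title) {
  d$threshold <- ifelse(d$padj < 0.05, "sig", "non-sig")
  ggplot(d, aes(x = baseMean, y = log2FoldChange)) +
    geom_point(aes(colour = threshold), size = 1, shape = 20) +
    scale_x_continuous(trans = "log10") +
    scale_colour_manual(values = c("non-sig" = "gray70", "sig" = "red")) +
    geom_hline(yintercept = 0, linetype = "dashed") +
    labs(x = "mean of normalized counts", title = title) +
    theme_classic()
}

volcano_plot <- function(d, title) {
  ggplot(d, aes(x = log2FoldChange, y = -log10(padj))) +
    geom_point() +
    labs(x = "fold change", title = title) +
    theme_classic()
}

## All MA plots in one PDF, one per page
pdf("maplot.pdf")
for (nm in names(dfs)) print(ma_plot(dfs[[nm]], nm))
dev.off()

## All volcano plots in another PDF, one per page
pdf("volcano.pdf")
for (nm in names(dfs)) print(volcano_plot(dfs[[nm]], nm))
dev.off()
```

Keeping the plot-building functions free of device handling means the same functions can be reused interactively or with a different output device.
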
Mikael Jagan
  • The follow-up post is here: https://stackoverflow.com/questions/71086870/lapply-on-list-of-data-frames. Thanks for helping me. – user3138373 Feb 11 '22 at 22:04