I am trying to create a new ggplot with each loop iteration. I have read previous posts that say to use print() to create each new plot. However, I keep getting this error: `Error in FUN(X[[i]], ...) : object 'logFC' not found`. When I drop the print() from the script, it runs but produces a series of empty plots. What am I doing wrong?

## Libraries to load
# Assumes dependencies are installed via install.packages() or BiocManager::install()
library("ggplot")
library(ggrepel)

## set path 
setwd("Path")
## Path to CSV file
# Needs to be modified for each project
# Can this be made more dynamic?
filenames = list.files(path = "Path", pattern="*.csv", full.names = FALSE)

for (i in filenames) { 
    CSVFile <-read.csv(i, na.strings=c("","NA"), header=TRUE)

    # The significantly differentially expressed genes are the ones found in the upper-left and upper-right corners.
    # Add a column to the data frame to specify if they are UP- or DOWN-regulated (logFC positive or negative, respectively)
    # Add a column that defaults to "NO" (not differentially expressed)
    CSVFile$diffexpressed <- "NO"
    # if logFC > 0.6 and FDR < 0.05, set as "UP"
    CSVFile$diffexpressed[CSVFile$logFC > 0.6 & CSVFile$FDR < 0.05] <- "UP"
    # if logFC < -0.6 and FDR < 0.05, set as "DOWN"
    CSVFile$diffexpressed[CSVFile$logFC < -0.6 & CSVFile$FDR < 0.05] <- "DOWN"

    # Re-plot but this time color the points with "diffexpressed"
    p <- ggplot(data=CSVFile, aes(x=logFC, y=-log10(FDR), col=diffexpressed)) + geom_point() + theme_minimal()

    # Add lines as before...
    p2 <- p + geom_vline(xintercept=c(-0.6, 0.6), col="red") +
          geom_hline(yintercept=-log10(0.05), col="red")

    # Now write down the name of genes beside the points...
    # Create a new column "delabel" that will contain the names of the differentially expressed genes (NA otherwise)
    CSVFile$delabel <- NA
    CSVFile$delabel[CSVFile$diffexpressed != "NO"] <- CSVFile$genes[CSVFile$diffexpressed != "NO"]

    # Finally, we can organize the labels nicely using the "ggrepel" package and the geom_text_repel() function
    # plot adding up all layers we have seen so far
    png(file = paste("Path_Out", i, ".png"), width = 6, height = 6, units = 'in', res=600)
    p3<-ggplot(data=CSVFile, aes(x=logFC, y=-log10(FDR), col=diffexpressed, label=delabel)) +
         geom_point() + 
         theme_minimal() +
         geom_text_repel() +
         scale_color_manual(values=c("blue", "black", "red")) +
         geom_vline(xintercept=c(-1.5, 1.5), col="red") +
         geom_hline(yintercept=-log10(0.05), col="red")
    print(p3)
    dev.off()
}
graphics.off()
Genetics
  • Are you sure there is a column in your data named `logFC`? The error message indicates that's probably not the case. It's hard to help further without a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – MrFlick May 23 '22 at 16:52
  • `library("ggplot")` is wrong, it's `ggplot2`. – Rui Barradas May 23 '22 at 16:53
  • Hi MrFlick. Good call: the column in the data I was working on was named `logFC`, but in the new dataset it is `logFC.something`. Is there a way to plot this data successfully when `logFC` is only part of the column name? – Genetics May 23 '22 at 17:34
  • Try troubleshooting by running your code outside of the loop. Once everything is working as expected, incorporate it back into the loop. – Skaqqs May 23 '22 at 18:37
  • Try calling `logFC` using the `$` operator, so something like `aes(x=CSVFile$logFC, ...` in order to troubleshoot the issue – grapestory May 23 '22 at 19:35
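
On the follow-up about columns named like `logFC.something`: one possible approach, sketched here under assumptions, is to rename the matching column to plain `logFC` right after reading each file, so the existing `aes(x=logFC, ...)` calls can stay as they are. The helper name `standardise_logfc`, the `prefix` argument, and the toy data frame are hypothetical and not part of the original script; the sketch uses base R only and assumes the fold-change column name starts with `logFC`.

```r
# Hypothetical helper: rename the first column whose name starts with `prefix`
# (e.g. "logFC.something") to exactly `prefix`, and fail loudly if none exists.
standardise_logfc <- function(df, prefix = "logFC") {
  hits <- grep(paste0("^", prefix), names(df))
  if (length(hits) == 0) {
    stop("No column starting with '", prefix, "' found; columns are: ",
         paste(names(df), collapse = ", "))
  }
  if (length(hits) > 1) {
    warning("Several columns match '", prefix, "'; using ", names(df)[hits[1]])
  }
  names(df)[hits[1]] <- prefix
  df
}

# Toy data standing in for one CSV file (values are made up for illustration)
toy <- data.frame(genes = c("A", "B"),
                  logFC.treated = c(1.2, -0.8),
                  FDR = c(0.01, 0.20))
toy <- standardise_logfc(toy)
names(toy)
```

Inside the loop this would presumably be called as `CSVFile <- standardise_logfc(CSVFile)` immediately after `read.csv()`. Because it stops with the list of available columns when nothing matches, a file with an unexpected header would be reported by name instead of surfacing later as `object 'logFC' not found`.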

0 Answers