(Fig. 3a, b, Extended Data Fig. 3a, b and Supplementary Table 1). After 48 h, more than one-third of the transcriptome was differentially expressed (>5,000 genes; 405 genes encoding for proteins in the extracellular region, Gene Ontology (GO) accession 0005576), significantly overlapping with the gene expression changes of A375 tumours in vivo after 5 days of vemurafenib treatment (Fig. 3a, b and Extended Data Fig. 3c). Similar extensive gene expression changes were observed in Colo800 and UACC62 melanoma cells treated with vemurafenib and H3122 lung adenocarcinoma cells treated with crizotinib (Extended Data Fig. 3d). Despite different cell lineages, different oncogenic drivers, and different targeted therapies we observed a significant overlap between the secretome of melanoma and lung adenocarcinoma cells (P < 9.11 × 10−5)
The original paper
I would like to see similar to the figure f where it shows the intersection and significance overlap. To achieve that i got this code working till the intersection part but I dont know how to run the significance part.
library(reshape2)
library(venneuler)
RNA_seq_cds <- read.csv("~/Downloads/RNA_seq_gene_set.txt", header=TRUE, sep="\t")
head(RNA_seq_cds)
ATAC_seq <- read.csv("~/Downloads/ATAC_seq_gene_set.txt", header=TRUE, sep="\t")
head(ATAC_seq)
RNA_seq <- RNA_seq_cds
ATAC_seq <- ATAC_seq
#https://stackoverflow.com/questions/6988184/combining-two-data-frames-of-different-lengths
cbindPad <- function(...) {
args <- list(...)
n <- sapply(args, nrow)
mx <- max(n)
pad <- function(x, mx) {
if (nrow(x) < mx) {
nms <- colnames(x)
padTemp <- matrix(NA, mx - nrow(x), ncol(x))
colnames(padTemp) <- nms
if (ncol(x) == 0) {
return(padTemp)
} else {
return(rbind(x, padTemp))
}
} else {
return(x)
}
}
rs <- lapply(args, pad, mx)
return(do.call(cbind, rs))
}
dat <- cbindPad(ATAC_seq, RNA_seq)
vennfun <- function(x) {
x$id <- seq(1, nrow(x)) #add a column of numbers (required for melt)
xm <- melt(x, id.vars="id", na.rm=TRUE) #melt table into two columns (value & variable)
xc <- dcast(xm, value~variable, fun.aggregate=length) #remove NA's, list presence/absence of each value for each variable (1 or 0)
rownames(xc) <- xc$value #value column=rownames (required for Venneuler)
xc$value <- NULL #remove redundent value column
xc #output the new dataframe
}
#https://stackoverflow.com/questions/9121956/legend-venn-diagram-in-venneuler
VennDat <- vennfun(dat)
genes.venn <- venneuler(VennDat)
genes.venn$labels <- c("RNA", "\nATAC" )
# plot(genes.venn, cex =15, )
#https://stackoverflow.com/questions/30225151/how-to-create-venn-diagram-in-r-studio-from-group-of-three-frequency-column
#https://rstudio-pubs-static.s3.amazonaws.com/13301_6641d73cfac741a59c0a851feb99e98b.html
vd <- venneuler(VennDat)
vd$labels <- paste(genes.venn$labels, colSums(VennDat))
plot(vd, cex=10)
text(.3, .45,
bquote(bold("Common ="~.(as.character(sum(rowSums(VennDat) == 2))))),
col="red", cex=1)
LABS <- vd$labels
The above code gives me the intersection plot
Now the significance part how do i do that between two gene sets and show it as shown in the original figure.
My data which i have used to generate the above plot
Any suggestion or help would be really appreciated.