I've never made a plot like this before, so apologies if this is a basic question. I am stuck on how to make a chord diagram in which the outer sections are my column headings (drug mechanisms) and the inner connections between sections are the rows (genes). The genes don't need to be named in the plot, as there are so many.
My data has genes as rows and drug mechanisms as columns, with a 1 marking an interaction and a 0 marking none.
For example, a subset of my data looks like:
Gene Diuretic Beta_blocker ACE_inhibitor
Gene1 1 0 0
Gene2 0 0 1
Gene3 1 1 1
Gene4 0 1 1
My full data is actually 700 genes by 15 drug-mechanism columns, all zeros and ones. I am currently just creating a chord diagram with:
library(data.table) # fread
library(dplyr)      # %>% and select
library(magrittr)   # set_rownames
library(circlize)   # circos.par, chordDiagram
df <- fread('df.csv')
df[is.na(df)] <- 0
df <- df %>% data.frame %>% set_rownames(.$Gene) %>% dplyr::select(-Gene)
mt <- as.matrix(df)
circos.par(gap.degree = 0.9) # set this as I was otherwise getting an error with my full data
chordDiagram(mt, transparency = 0.5)
With my total data this plot looks like:
I've been getting various errors when trying to get this plot to have only 15 sections (and even just when trying to label the sections with the column names).
Is there a way for me to plot a chord diagram in which each section represents one column? Then any gene/row that has an interaction (a 1 in the data) for that section and any other section would be shown as a connection in the chord diagram. I don't need the gene names to be visible; I am just looking to visualize the amount of overlap between my columns/sections.
Example input data (for this, the goal would be a plot with only 3 sections, one per column, showing their overlap):
df <- structure(list(Gene = c("Gene1", "Gene2", "Gene3", "Gene4"),
Diuretic = c(1L, 0L, 1L, 0L), Beta_blocker = c(0L, 0L, 1L,
1L), ACE_inhibitor = c(0L, 1L, 1L, 1L)), row.names = c(NA,
-4L), class = c("data.table", "data.frame"))
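To make the target concrete, here is a rough sketch of the kind of overlap I think I want to plot. It is not something I have working on my full data. It assumes that "overlap" means the number of genes flagged in both of two columns, which can be computed with `crossprod()`. The names `mt` and `overlap` are just illustrative, the matrix is typed in by hand to match the example above, and the plotting step only runs if circlize is installed.

```r
# Toy 0/1 gene-by-mechanism matrix matching the example data above
mt <- matrix(
  c(1, 0, 0,
    0, 0, 1,
    1, 1, 1,
    0, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Gene", 1:4),
                  c("Diuretic", "Beta_blocker", "ACE_inhibitor"))
)

# crossprod(mt) is t(mt) %*% mt: entry [i, j] counts the genes that have a 1
# in both column i and column j (assumed definition of "overlap")
overlap <- crossprod(mt)
diag(overlap) <- 0  # drop each column's overlap with itself

# Plot only if circlize is available; symmetric = TRUE tells chordDiagram
# to use just the lower triangle, so each pair is drawn once
if (requireNamespace("circlize", quietly = TRUE)) {
  circlize::circos.clear()
  circlize::chordDiagram(overlap, symmetric = TRUE, transparency = 0.5)
}
```

With a matrix like this the sectors would be the 3 columns (15 for my full data) rather than every gene, and the link widths would reflect how many genes each pair of mechanisms shares.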