I am trying to get boxplots for 4 different genes with the expression data for each gene across multiple patients.
I've tried several approaches and keep hitting errors. I can do it with the base boxplot() function, but I can't work out how to do it in ggplot, and I couldn't find anything that helps despite spending hours yesterday reading other questions and answers. Most other examples have the data in 2 columns, so you can specify x = column a and y = column b. However, I want to plot all 4 columns of my df. I can plot one gene at a time in ggplot, but not all 4 together.
The data I have, BCON_sig_genes, is expression values (between 3 and 6) for 4 genes across 152 samples. The df is 152 obs. of 4 variables: each column is headed with a gene name, each row is a sample, and every cell is an expression value, as shown below.
         CD3E      LAT    ZAP70      LCK
1002 4.214679 5.652482 4.788204 5.393783
1022 4.424925 5.776641 4.864269 5.593587
8035 4.327270 5.725364 4.509920 4.961659
8037 4.415715 5.494048 4.435241 5.081846
9004 4.290078 5.265329 4.799106 5.275424
9005 4.233490 5.338098 4.666506 5.069394
The following code gets me one gene at a time, by substituting in the name of the gene.
BCON_sig_genes %>% ggplot(aes(y = CD3E, x = "CD3E"))+ geom_boxplot()
[Image: ggplot boxplot, 1 gene only]
I also tried gene <- colnames(BCON_sig_genes)
and then passing x = gene, but that fails with the following error message:
Error: Aesthetics must be either length 1 or the same as the data (152): x
I think I need to sort out what y is. I tried leaving it blank, hoping ggplot would take all the data and split it by column, but no luck.
I tried using a gather() function and making key and value but I couldn't quite figure it out without getting errors... but this felt like I was on the right track!
With the base function all I have to do is boxplot(BCON_sig_genes)
and it just plots all 4 genes on one graph with the correct values.
[Image: base function boxplot, all genes]
I think I need to wrangle the data better for ggplot so I can tell it that y is just all the expression values for each column but I'm not sure how.
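A minimal sketch of that wrangling step, for reference. It assumes tidyr (version 1.0.0 or later) is available and uses pivot_longer(), the newer replacement for gather(). The toy data frame only copies the six rows shown above, and the column names "gene" and "expression" are made up for illustration, not taken from the real data.

```r
library(tidyr)
library(ggplot2)

# Toy stand-in for BCON_sig_genes: only the six rows shown above
BCON_sig_genes <- data.frame(
  CD3E  = c(4.214679, 4.424925, 4.327270, 4.415715, 4.290078, 4.233490),
  LAT   = c(5.652482, 5.776641, 5.725364, 5.494048, 5.265329, 5.338098),
  ZAP70 = c(4.788204, 4.864269, 4.509920, 4.435241, 4.799106, 4.666506),
  LCK   = c(5.393783, 5.593587, 4.961659, 5.081846, 5.275424, 5.069394),
  row.names = c("1002", "1022", "8035", "8037", "9004", "9005")
)

# Reshape wide -> long: one row per (sample, gene) pair.
# "gene" and "expression" are hypothetical column names chosen here.
long <- pivot_longer(
  BCON_sig_genes,
  cols = everything(),
  names_to = "gene",
  values_to = "expression"
)

# With long data, x and y are each a single column again
p <- ggplot(long, aes(x = gene, y = expression)) +
  geom_boxplot()
print(p)
```

In this form ggplot orders the genes alphabetically on the x axis. If the original column order matters, converting gene to a factor with levels = colnames(BCON_sig_genes) before plotting should keep it.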
Any help would be much appreciated!!
Thanks, Vicky