From a mouse experiment I have data for about fifty mice covering about 15 different metrics. I generated correlation plots of every metric against every other metric to identify which measurements correlate with each other and which don't.
``` r
library(ggplot2)

df <- structure(list(mouse_ID = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 22L, 23L, 24L, 25L,
26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L,
39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L,
52L, 53L, 54L, 55L), treatment = structure(c(1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L
), .Label = c("not challenged", "vehicle control", "high",
"medium", "low", "reference"
), class = "factor"), value.x = c(0.003725, 0.0208, 0.004475,
0, 0.00895, 1.00625, 1.0125, 1.014, 1.1025, 0.925, 0.897, 0.99,
1.1495, 1.0125, 1.08, 0.88425, 1.001, 0.864, 0.89175, 0.9425,
0.943, 1.07325, 0.73575, 0.606, 0.682, 0.79925, 0.87, 0.60225,
0.756, 0.891, 0.6555, 0.572, 0.253, 0.255, 0.396, 0.4495, 0.299,
0.39, 0.3, 0.5365, 0.378, 0.475, 0.73575, 0.4895, 0.468, 0.90625,
0.3905, 0.4995, 0.60375, 0.744, 0.75, 0.5535), value.y = c(0,
0, 0, 0, 0, 5.775, 4.6875, 4.992, 7.245, 6.0125, 3.795, 4.99125,
7.26275, 4.35375, 4.3875, 3.6025, 4.389, 3.852, 3.444, 4.205,
5.207, 4.77, 3.052, 2.65125, 2.024, 3.6835, 2.9, 1.5695, 2.7,
2.619, 2.964, 1.936, 0.539, 0.408, 1.056, 1.085, 0.897, 0.795,
0.5, 1.0915, 0.5355, 0.575, 2.8885, 2.0915, 1.755, 3.40625, 1.42,
1.6095, 2.835, 2.3715, 2.7, 1.927)), row.names = c(NA, -52L),
class = c("tbl_df", "tbl", "data.frame"))

ggplot(data = df, aes(x = value.x, y = value.y)) +
  geom_point(aes(color = treatment)) +
  geom_smooth(method = lm, se = TRUE)
#> `geom_smooth()` using formula 'y ~ x'
```
It turns out that a long list of over 100 plots is really hard to take in, and each plot carries relatively little information. I would like to arrange these scatter plots in a 15 x 15 grid of the measurements, show the correlation coefficient of each linear model as the background color, and overlay the linear model and the data points.
Is this feasible in ggplot? Is there another tool I could use? And if so, how should I arrange the data structure? I am comfortable dealing with purrr and nested lists for such models, but in this case a long list does not seem ideal -- a matrix-style arrangement would fit the output much better.
Any thoughts or suggestions on how to approach this?
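To make the "matrix-style arrangement" concrete, here is a rough sketch of what I have in mind, not a tested solution. It assumes the full data sit in a wide table with one row per mouse and one column per metric; the table `wide` and the columns `metric_a` to `metric_d` are simulated placeholders, not my real readouts.

```r
# Hypothetical stand-in for the real data: one row per mouse, one column per metric
set.seed(42)
n <- 50
wide <- data.frame(mouse_ID = seq_len(n), metric_a = rnorm(n))
wide$metric_b <- wide$metric_a + rnorm(n, sd = 0.3)   # correlated with metric_a
wide$metric_c <- rnorm(n)                             # independent noise
wide$metric_d <- -wide$metric_a + rnorm(n, sd = 0.3)  # anti-correlated with metric_a

metrics <- setdiff(names(wide), "mouse_ID")

# Matrix-style summary: Pearson r for every pair of metrics
cor_mat <- cor(wide[, metrics], use = "pairwise.complete.obs")

# The same information as a long table, one row per tile of the grid
cor_long <- as.data.frame(as.table(cor_mat), responseName = "r")
names(cor_long)[1:2] <- c("metric_y", "metric_x")

print(round(cor_mat, 2))
```

The matrix is convenient for inspection, while the long table has one row per tile, which seems closer to what a faceted plot would need.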
Created on 2021-01-20 by the reprex package (v0.3.0)
Sorry, my explanation wasn’t clear. The data I am showing above is only a fraction of the data available. Here I am plotting the linear correlation of two readouts. But I have over a dozen readouts that I used for pairwise comparisons. I am looking for something like this:
Each tile should be colored by a metric of the linear model (e.g. correlation coefficient or p-value), but it should also show the plotted data with the linear model overlaid.
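A minimal sketch of such a tiled plot, under some assumptions: four columns of the built-in `mtcars` data stand in for my readouts (each car plays the role of a mouse), the tile color is the Pearson correlation rather than a p-value, and the names `long`, `pairs` and `tiles` are made up for illustration. The idea is to join the long table with itself so that every metric is paired with every other, then let `facet_grid()` lay out the tiles and draw a full-panel `geom_rect()` behind the points.

```r
library(ggplot2)

# Stand-in data (assumption): mtcars columns instead of the real readouts
metrics <- c("mpg", "disp", "hp", "wt")
wide <- data.frame(id = rownames(mtcars), mtcars[, metrics],
                   stringsAsFactors = FALSE)

# Long format: one row per (id, metric)
long <- data.frame(
  id     = rep(wide$id, times = length(metrics)),
  metric = rep(metrics, each = nrow(wide)),
  value  = unlist(wide[, metrics], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Self-join on id: one row per (id, metric.x, metric.y)
pairs <- merge(long, long, by = "id", suffixes = c(".x", ".y"))

# One row per tile, with the correlation used for the background color
tiles <- unique(pairs[, c("metric.x", "metric.y")])
tiles$r <- mapply(function(mx, my) {
  d <- pairs[pairs$metric.x == mx & pairs$metric.y == my, ]
  cor(d$value.x, d$value.y)
}, tiles$metric.x, tiles$metric.y, USE.NAMES = FALSE)

p <- ggplot(pairs, aes(x = value.x, y = value.y)) +
  # Full-panel rectangle behind the data, filled by the correlation
  geom_rect(data = tiles, aes(fill = r), inherit.aes = FALSE,
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, alpha = 0.5) +
  geom_point(size = 0.8) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, colour = "black") +
  facet_grid(metric.y ~ metric.x, scales = "free") +
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Pearson r")

print(p)
```

The diagonal tiles plot each metric against itself and could be filtered out of `pairs` and `tiles`. Keeping a `treatment` column in the long table should also allow coloring the points by treatment as in my single plot above, though I have not tried that here.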