Forest plot in R: Trying to plot more than three points

Question

This is my first time posting a question here, so please forgive me if my question is unclear or incomplete.

My scenario: I have a dataframe that has 21 meta-analytic distributions (Distribution1-Distribution21). For each distribution, I have 10 estimates of the respective meta-analytic mean effect size (ES1-ES10). Effectively, I have a meta-analytic mean effect size and nine other estimates of this mean from a variety of sensitivity analyses (i.e., outlier and publication bias analyses).

Using adapted code (can provide link if needed; I am not able to post multiple links because I am a new user), I am able to plot three estimates of each distribution's mean estimate. To give you an idea of what I'm talking about, imagine a figure that displays the mean estimate and its confidence interval.

Here is the dataframe and adapted code:

x   |   ES1 |   ES2 |   ES3 |   ES4 |   ES5 |   ES6 |   ES7 |   ES8 |   ES9 |   ES10
Distribution1   |   -0.07   |   -0.07   |   -0.06   |   -0.07   |   -0.02   |   -0.03   |   -0.09   |   -0.07   |   0.00    |   0.01
Distribution2   |   -0.06   |   -0.06   |   -0.04   |   -0.05   |   -0.04   |   -0.05   |   -0.07   |   -0.06   |   -0.03   |   0.01
Distribution3   |   -0.08   |   -0.09   |   -0.07   |   -0.08   |   -0.01   |   -0.08   |   -0.10   |   -0.08   |   -0.01   |   0.01
Distribution4   |   -0.10   |   -0.11   |   -0.10   |   -0.09   |   -0.05   |   -0.07   |   -0.11   |   -0.10   |   -0.06   |   0.010
Distribution5   |   -0.08   |   -0.08   |   -0.06   |   -0.08   |   -0.02   |   -0.03   |   -0.10   |   -0.08   |   0.00    |   0.02
Distribution6   |   -0.09   |   -0.10   |   -0.08   |   -0.09   |   -0.03   |   -0.08   |   -0.11   |   -0.09   |   -0.03   |   0.02
Distribution7   |   -0.11   |   -0.13   |   -0.10   |   -0.11   |   -0.04   |   -0.04   |   -0.12   |   -0.11   |   -0.08   |   0.01
Distribution8   |   -0.10   |   -0.14   |   -0.06   |   -0.10   |   -0.01   |   -0.08   |   -0.13   |   -0.10   |   -0.06   |   0.04
Distribution9   |   -0.04   |   -0.05   |   -0.02   |   -0.04   |   0.00    |   -0.04   |   -0.06   |   -0.04   |   -0.06   |   0.00
Distribution10  |   -0.11   |   -0.12   |   -0.09   |   -0.11   |   -0.03   |   -0.09   |   -0.12   |   -0.11   |   -0.11   |   0.00
Distribution11  |   -0.06   |   -0.09   |   -0.04   |   -0.06   |   -0.01   |   -0.01   |   -0.09   |   -0.06   |   -0.01   |   0.04
Distribution12  |   -0.11   |   -0.11   |   -0.09   |   -0.11   |   -0.09   |   -0.10   |   -0.12   |   -0.11   |   -0.08   |   -0.03
Distribution13  |   -0.19   |   -0.22   |   -0.16   |   -0.19   |   -0.08   |   -0.17   |   -0.21   |   -0.19   |   -0.13   |   -0.01
Distribution14  |   -0.01   |   -0.02   |   0.00    |   -0.01   |   0.00    |   0.00    |   -0.03   |   -0.01   |   -0.02   |   -0.01
Distribution15  |   -0.19   |   -0.22   |   -0.16   |   -0.19   |   -0.08   |   -0.17   |   -0.21   |   -0.19   |   -0.13   |   -0.01
Distribution16  |   -0.09   |   -0.1    |   -0.08   |   -0.09   |   -0.01   |   -0.09   |   -0.11   |   -0.09   |   -0.07   |   0.00
Distribution17  |   -0.16   |   -0.19   |   -0.14   |   -0.16   |   -0.07   |   -0.12   |   -0.18   |   -0.16   |   -0.10   |   0.00
Distribution18  |   -0.05   |   -0.06   |   -0.03   |   -0.05   |   -0.02   |   -0.02   |   -0.05   |   -0.05   |   -0.02   |   0.01
Distribution19  |   -0.09   |   -0.10   |   -0.08   |   -0.09   |   -0.01   |   -0.08   |   -0.11   |   -0.09   |   -0.06   |   0.01
Distribution20  |   -0.02   |   -0.03   |   -0.01   |   -0.02   |   0.01    |   0.00    |   -0.04   |   -0.02   |   0.00    |   0.02
Distribution21  |   -0.1    |   -0.12   |   -0.09   |   -0.1    |   -0.02   |   -0.08   |   -0.12   |   -0.1    |   -0.04   |   0.02

#My APA-format theme
#https://gist.github.com/akshaycuhk/01576c57149a9a3d14514c9a3c4b4b1d

install.packages("ggplot2")
library(ggplot2)

apatheme=theme_bw()+ 
theme(panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
panel.border=element_blank(),
axis.line=element_line(),
text=element_text(family='Times'),
legend.position='bottom', axis.text=element_text(size=14),
axis.title=element_text(size=14,face="bold"))

credplot.gg <- function(d){
# d is a data frame with at least 4 columns
# d$x gives variable names
# d$ES1 gives center point
# d$ES2 gives lower limits
# d$ES3 gives upper limits
require(ggplot2)
p <- ggplot(d, aes(x=x, y=ES1, ymin=ES2, ymax=ES3))+
geom_pointrange()+
geom_hline(yintercept = 0, linetype=2)+
coord_flip()+
xlab('Distribution')+
ylab('Effect size')
return(p)
}

# load your data below 
d <- read.table(file.choose(), sep=",", header=TRUE)
Fig1 <-credplot.gg(d) +xlim("Distribution1", 
"Distribution2", 
"Distribution3", 
"Distribution4", 
"Distribution5",
"Distribution6", 
"Distribution7", 
"Distribution8", 
"Distribution9", 
"Distribution10", 
"Distribution11", 
"Distribution12", 
"Distribution13",
"Distribution14", 
"Distribution15", 
"Distribution16", 
"Distribution17", 
"Distribution18", 
"Distribution19",
"Distribution20",
"Distribution21")
Fig1

I am not yet able to embed images so here is a link to the data file, script, and figure: https://www.dropbox.com/sh/aczv1dw5mjmone8/AACqekiFVdJqeA1cRvIvs7NFa?dl=0

My question: Is there a way for me to increase the number of point estimates from three to ten? Specifically, can I plot all ten estimates (ES1 -> ES10) for all 21 distributions (Distribution1 -> Distribution21)? Ideally, each point estimate would have its own shape/marker on the line to distinguish it from the others and a legend would accompany the figure.

Thanks to anyone who is willing to help me :)

I'm not entirely sure what you want your output to look like, but it sounds like it could be an issue of having a dataset in wide format that would be easier to plot if it were in a long format. See examples such as [this](http://stackoverflow.com/questions/9531904/plot-multiple-columns-on-the-same-graph-in-r) and [this](http://stackoverflow.com/questions/12331597/plotting-multiple-columns-with-ggplot2). — aosmith, Sep 06 '16 at 21:34
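To illustrate the wide-versus-long point, here is a minimal base-R sketch. The toy data frame `d_toy` and the result `long_toy` are hypothetical names; the values are copied from the first two rows and first three ES columns of the question's table. It is only one way to do the reshape, not necessarily what the linked answers use.

```r
# Toy wide data frame with the same layout as the question's data
# (hypothetical subset: two distributions, three ES columns).
d_toy <- data.frame(
  x   = c("Distribution1", "Distribution2"),
  ES1 = c(-0.07, -0.06),
  ES2 = c(-0.07, -0.06),
  ES3 = c(-0.06, -0.04)
)

# stack() puts the selected columns into 'values' and their names into 'ind',
# one column after another (all of ES1, then all of ES2, then all of ES3).
stacked <- stack(d_toy[, c("ES1", "ES2", "ES3")])

# Rebuild a long data frame: one row per distribution/estimate pair.
long_toy <- data.frame(
  x        = rep(d_toy$x, times = 3),
  variable = stacked$ind,
  value    = stacked$values
)
print(long_toy)
```

In the long layout every estimate is a row, so a single column (`variable` here) can be mapped to shape or colour in ggplot2.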

score 0 · Accepted Answer · answered Sep 06 '16 at 21:47

Is this what you are trying for? It involves reshaping your dataset into long format, adding points with different shapes per "E" category and then drawing lines through the points for each "Distribution" to emulate a forest plot.

library(reshape2)
dat2 = melt(d, id.vars = "x")

# Set x factor order in order that appears in data
dat2$x = factor(dat2$x, levels = unique(dat2$x))

ggplot(dat2, aes(x=x, y= value))+
    geom_point(aes(shape = variable)) +
    geom_line() +
    scale_shape_manual(values = 0:9) +
    geom_hline(yintercept = 0, linetype=2) +
    coord_flip() +
    xlab('Distribution') +
    ylab('Effect size')

Note that things get ugly fast when using this many shapes. See here for some shape options.

answered Sep 06 '16 at 21:47

aosmith

Hi @aosmith: is there a way for ES1->ES5 and ES6->ES10 to share the same shapes but have different colors? Specifically, could ES1 and ES6 have the same shape but one be blue and the other be red? Then, could ES2 and ES7, ES3 and ES8, and so on follow this pattern? – James F Sep 26 '16 at 03:11
@JamesF You can set values for groups in ggplot2 using the appropriate `scale_*_manual` function. In your case it sounds like you'd use the same shape for the pairs you listed via `values` in `scale_shape_manual`; see the help page for a few simple examples. You could do the same for the colors. If you want a single legend the code will be a little more complicated, but there are quite a few examples around Stack Overflow on how to do this. – aosmith Sep 26 '16 at 17:50
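
A rough sketch of the `scale_*_manual` approach described in the comment above, assuming long-format data with columns `x`, `variable`, and `value`. The toy data frame `dat_toy` and the helper vectors `shape_vals` and `colour_vals` are hypothetical names; the values are the Distribution1 row from the question. Giving both scales the same legend title is one common way to get a single combined legend.

```r
library(ggplot2)

# Toy long-format data: one distribution, ten estimates (hypothetical subset).
es_names <- paste0("ES", 1:10)
dat_toy <- data.frame(
  x        = "Distribution1",
  variable = factor(es_names, levels = es_names),
  value    = c(-0.07, -0.07, -0.06, -0.07, -0.02, -0.03, -0.09, -0.07, 0.00, 0.01)
)

# ES1-ES5 get shapes 0-4 in blue; ES6-ES10 reuse the same shapes in red,
# so ES1/ES6, ES2/ES7, ... share a shape and differ only in colour.
shape_vals  <- setNames(rep(0:4, times = 2), es_names)
colour_vals <- setNames(rep(c("blue", "red"), each = 5), es_names)

p <- ggplot(dat_toy, aes(x = x, y = value)) +
  geom_line(aes(group = x)) +
  geom_point(aes(shape = variable, colour = variable)) +
  # Same title on both scales so ggplot2 can merge them into one legend
  scale_shape_manual(name = "Estimate", values = shape_vals) +
  scale_colour_manual(name = "Estimate", values = colour_vals) +
  geom_hline(yintercept = 0, linetype = 2) +
  coord_flip() +
  xlab("Distribution") +
  ylab("Effect size")
p
```

The named vectors make the pairing explicit, which is easier to check than relying on the order of factor levels.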
