
I would like to fine-tune the ggplot2 shape types in two overlaid layers when coding by groups. Is this possible? I created a strip plot with points for the mean observations, using two shapes for the two levels of seeding rate and three colors for the fertilizer price scenarios. I also included jittered points for the observations used to calculate these means, coded with the same color and shape groups as the mean points, and lines showing the standard error of each mean.
To improve the visualization, I would like the jittered observations to be open shapes and the means to be closed shapes.

library(ggplot2)
p <- ggplot() +
  geom_jitter(
    data = econ4, position = position_jitter(0.2),
    aes(x = location, y = profit, shape = seeding.rate, colour = urea.cost.USD, alpha = 0.05)
  ) +
  theme_bw() +
  geom_hline(yintercept = 0) +
  labs(y = "profit per ha (USD)")
p <- p + geom_point(data = econ4, aes(x = location, y = mean.profit, colour = urea.cost.USD, shape = seeding.rate))
p <- p + geom_linerange(data = econ4, aes(x = location, y = mean.profit, ymin = mean.profit - se.profit, ymax = mean.profit + se.profit, colour = urea.cost.USD))
#p<-p+scale_shape_manual(values=1:2)
p<-p+scale_colour_manual(values=c("#0072B2","#009E73", "#CC79A7"))
p<-p + coord_flip()
p<-p+scale_shape_manual(values=1:2) #this changes all shapes to open.  I would like this to apply only to the geom_jitter.


stefan
LKK
  • Welcome to SO! Sure, this could be achieved. But to help you any further, we need [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), including a snippet of your data or some fake data, so that we can run your code and figure out a solution. – stefan Apr 19 '23 at 20:21

1 Answer


Because you want different shapes for the jittered and the mean points, mapping seeding.rate on shape isn't sufficient. Instead, you have to "recode" seeding.rate to differentiate between jittered and mean points, which gives four categories. To this end, you could map e.g. paste0(seeding.rate, ".jitter") on the shape aes in geom_jitter and paste0(seeding.rate, ".mean") on the shape aes in geom_point. You can then assign your desired four shapes to these four categories.
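As a small illustration of the recoding, here is what the two paste0() calls produce for the two seeding-rate labels used in the fake data further down (the labels are taken from that example, not from your real data):

```r
# The two seeding-rate labels used in the fake example data
seeding.rate <- c("Mirsky", "Poffenbarger")

# One key per seeding rate for each layer: four shape categories in total
jitter_keys <- paste0(seeding.rate, ".jitter")
mean_keys <- paste0(seeding.rate, ".mean")

shape_keys <- c(jitter_keys, mean_keys)
shape_keys
```

These four strings are the names used in the `values` vector of scale_shape_manual().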

Additionally, I slightly refactored your code: instead of computing the mean and se manually, I use stat = "summary" to compute the statistics on the fly.
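To make clear what fun.data = "mean_se" computes, here is a minimal sketch comparing ggplot2's mean_se() with a by-hand calculation on a toy vector. The vector `x` is made up for illustration only.

```r
library(ggplot2)

# Toy values, for illustration only
x <- c(1, 2, 3)

# Standard error of the mean, computed by hand
se <- sqrt(var(x) / length(x))
by_hand <- c(y = mean(x), ymin = mean(x) - se, ymax = mean(x) + se)

# mean_se() returns a one-row data frame with columns y, ymin and ymax
from_ggplot <- unlist(mean_se(x))

all.equal(by_hand, from_ggplot)
```

With stat = "summary", this summary is computed once per group, so the grouping of the layer decides how many means and error lines you get.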

Using some fake random example data:

library(ggplot2)

set.seed(123)

econ4 <- data.frame(
  location = sample(LETTERS[1:5], 100, replace = TRUE),
  urea.cost.USD = sample(c("300", "550", "1000"), 100, replace = TRUE),
  seeding.rate = sample(c("Mirsky", "Poffenbarger"), 100, replace = TRUE),
  profit = rlnorm(100, 4) - 100
)

ggplot(econ4, aes(profit, location, colour = urea.cost.USD)) +
  geom_vline(xintercept = 0) +
  geom_jitter(
    position = position_jitter(0.2),
    aes(shape = paste0(seeding.rate, ".jitter")), alpha = .8,
    size = 2
  ) +
  geom_point(aes(shape = paste0(seeding.rate, ".mean")),
    stat = "summary", fun = mean, size = 2
  ) +
  geom_linerange(stat = "summary", fun.data = "mean_se") +
  scale_colour_manual(
    values = c("#0072B2", "#009E73", "#CC79A7")
  ) +
  scale_shape_manual(
    values = c(
      Mirsky.jitter = 16, Mirsky.mean = 21,
      Poffenbarger.jitter = 17, Poffenbarger.mean = 24
    ),
    breaks = c("Mirsky.jitter", "Poffenbarger.jitter"),
    labels = c("Mirsky", "Poffenbarger")
  ) +
  theme_bw() +
  labs(x = "profit per ha (USD)", shape = "seeding rate")
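Note that the shape values above use solid shapes (16, 17) for the jittered points and the fillable shapes 21 and 24 for the means, which draw as outlines when no fill is set. If you want it the other way round, as described in the question, a sketch along these lines should work. The choice of shapes 1 and 2 (hollow) and 16 and 17 (solid), the larger size of the mean points and the names `open_closed` and `p_open` are my assumptions; adjust them to taste.

```r
library(ggplot2)

set.seed(123)

# Fake data with the same columns as the example above
econ4 <- data.frame(
  location = sample(LETTERS[1:5], 100, replace = TRUE),
  urea.cost.USD = sample(c("300", "550", "1000"), 100, replace = TRUE),
  seeding.rate = sample(c("Mirsky", "Poffenbarger"), 100, replace = TRUE),
  profit = rlnorm(100, 4) - 100
)

# Assumed shape choice: 1 and 2 are hollow, 16 and 17 are solid
open_closed <- c(
  Mirsky.jitter = 1, Poffenbarger.jitter = 2,
  Mirsky.mean = 16, Poffenbarger.mean = 17
)

p_open <- ggplot(econ4, aes(profit, location, colour = urea.cost.USD)) +
  # Raw observations: open shapes
  geom_jitter(
    aes(shape = paste0(seeding.rate, ".jitter")),
    position = position_jitter(0.2), alpha = .8, size = 2
  ) +
  # Means: closed shapes, drawn slightly larger
  geom_point(
    aes(shape = paste0(seeding.rate, ".mean")),
    stat = "summary", fun = mean, size = 3
  ) +
  scale_shape_manual(
    values = open_closed,
    # Show only the two "mean" keys in the legend
    breaks = c("Mirsky.mean", "Poffenbarger.mean"),
    labels = c("Mirsky", "Poffenbarger")
  ) +
  theme_bw() +
  labs(x = "profit per ha (USD)", shape = "seeding rate")

p_open
```

The `breaks` argument only controls which keys appear in the legend; all four categories are still mapped to their shapes.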

stefan
  • I am now noticing that the `geom_linerange(stat = "summary", fun.data = "mean_se")` only creates standard error lines for each of the three urea cost scenarios. I am hoping to create se lines for each of the combinations of seeding rate and urea cost. Is there a way to do this? – LKK Apr 21 '23 at 18:40
  • Sure, this could be achieved. For example, you could map `seeding.rate` on an aesthetic for the linerange too, e.g. `geom_linerange(aes(linetype = seeding.rate), stat = "summary", fun.data = "mean_se")`. Or use the group aes, e.g. `aes(group = paste(seeding.rate, urea.cost.USD, sep = "_"))`. – stefan Apr 22 '23 at 08:40
  • Thank you. This worked well for me: `geom_linerange(aes(group = paste(seeding.rate, urea.cost.USD, sep = "_")), stat = "summary", fun.data = "mean_se")` – LKK Apr 25 '23 at 13:43
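For completeness, a self-contained sketch of the grouped linerange from the comments, using the same fake data as in the answer. The object name `p_grouped` and the plain `shape = seeding.rate` mapping for the means are only for illustration.

```r
library(ggplot2)

set.seed(123)

# Same fake data as in the answer
econ4 <- data.frame(
  location = sample(LETTERS[1:5], 100, replace = TRUE),
  urea.cost.USD = sample(c("300", "550", "1000"), 100, replace = TRUE),
  seeding.rate = sample(c("Mirsky", "Poffenbarger"), 100, replace = TRUE),
  profit = rlnorm(100, 4) - 100
)

p_grouped <- ggplot(econ4, aes(profit, location, colour = urea.cost.USD)) +
  # Explicit group: one mean +/- se range per seeding rate and urea cost
  geom_linerange(
    aes(group = paste(seeding.rate, urea.cost.USD, sep = "_")),
    stat = "summary", fun.data = "mean_se"
  ) +
  geom_point(
    aes(shape = seeding.rate),
    stat = "summary", fun = mean, size = 2
  ) +
  theme_bw() +
  labs(x = "profit per ha (USD)", shape = "seeding rate")

# Number of summarised line ranges in the first layer
nrow(layer_data(p_grouped, 1))

p_grouped
```

Without the group aes, the linerange layer is grouped only by colour and location, which is why only one line per urea cost showed up in each location.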