I am using ggplot and gridExtra to make two plots side by side with different data, and I'm observing unexpected behavior when I use vectors instead of a data frame to make the plot.
Here is an MWE that shows my problem:
library(ggplot2)
library(dplyr)
library(gridExtra)
cases <- c(1, 2)
df <- data.frame(
  case=cases,
  y1=c(1, 2),
  y2=c(2, 4),
  y3=c(3, 8),
  y4=c(4, 16),
  y5=c(5, 32)
)
x <- c(1, 2, 3, 4, 5)
plot_list <- list()
for(caso in cases){
  data <- df %>% filter(case == caso)
  y <- data %>% dplyr::select(starts_with('y')) %>% unlist(use.names=FALSE)
  dd <- data.frame(xdf=x, ydf=y)
  graph <- (
    ggplot()
    + geom_line(data=dd, aes(x=xdf, y=ydf))
    ## + geom_point(data=dd, aes(x=xdf, y=ydf)) # this line works
    + geom_point(aes(x=x, y=y)) # this line doesn't
  )
  plot_list[[length(plot_list)+1]] <- graph
}
grid.arrange(grobs=plot_list, ncol=2)
This code makes a plot with a line on the left and a parabola on the right. I marked the two lines that call geom_point. If I use the line with the data frame, everything works as expected. However, if I use the line with the vectors (the same ones that were used to create the data frame), then the points of the parabola are plotted in all the graphs.
Here is the resulting figure:
Clearly, the problem is solved by using data frames instead of vectors, but I want to understand why this behavior happens in the first place. I'd appreciate any insight into why R behaves in such a seemingly counterintuitive (at least to me) way.
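To narrow things down, here is a smaller check without the loop or gridExtra. It is only a sketch of what I suspect is going on, not a confirmed explanation: my assumption is that aes() stores the expressions x and y rather than their values when no data argument is given, so they are only looked up when the plot is built. The names p_vec, p_df, built_vec and built_df are made up for this example.

```r
library(ggplot2)

x <- 1:3
y <- c(1, 2, 3)

# Layer mapped to free-standing vectors: no data frame is attached to the layer
p_vec <- ggplot() + geom_point(aes(x = x, y = y))

# Layer mapped to a data frame: the values are stored with the layer right away
p_df <- ggplot() +
  geom_point(data = data.frame(xdf = x, ydf = y), aes(x = xdf, y = ydf))

# Reassign y after both plots were created, as the next loop iteration would
y <- c(10, 20, 30)

# Inspect the y values each plot actually uses once it is built
built_vec <- ggplot_build(p_vec)$data[[1]]$y
built_df <- ggplot_build(p_df)$data[[1]]$y
print(built_vec)
print(built_df)
```

If the assumption holds, built_vec follows the reassigned y while built_df keeps the original values, which would match what I see in the loop: both plots are only drawn by grid.arrange after the loop has finished, when y holds the values of the last case.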