1

I work with R and ggplot. I have already drawn points for 4 different data.frames, and now I want to draw 4 regression lines for these point sets.

My previous code:

ggplot() + 
    ggtitle("title")+
    xlab("date") +
    ylab("value") +
    geom_point(data=toplot$1, aes(x=date, y=x, color='1'), size = 4, shape=1) +     
    geom_point(data=toplot$2, aes(x=date, y=x, color='2'), size = 4, shape=2)+      
    geom_point(data=toplot$3, aes(x=date, y=x, color='3'), size = 4, shape=3)+      
    geom_point(data=toplot$4, aes(x=date, y=x, color='4'), size = 4, shape=4)+          
    scale_colour_manual(name = "legend", values = c('1'='green', "2"="red", "3"="blue", "4"="brown"))

When I add a line for the first data.frame

geom_smooth(data=toplot$1, formula=date~x,method=lm, color='1',aes(x=date, y=x)) 

I receive a message:

Only one unique x value each group.Maybe you want aes(group = 1)

If I add this line instead:

geom_smooth(data=toplot$1, formula=date~x,method=lm, color='1',aes(group=1)) 

I receive another message:

stat_smooth requires the following missing aesthetics: x, y

Maybe you know what I need to write as the aes argument (without any aes it also doesn't work).

Thank you.

user2957954
  • I don't see how you managed to get `aes(x = date, y = x)` to work when there is only one column of data. The more common way is to stack all the data in one column and create an indicator column for the group, so that you have to use each geom only once: `ggplot(data = blah, aes(x = blah, y = blah, group = factor(group_id))) + geom_point() + geom_smooth()` – Vlo Jul 29 '14 at 13:22
  • I have two columns of data: date and x (for value). – user2957954 Jul 29 '14 at 13:24
  • Not in the same data column. You seem to be using four columns of data defined from `toplot$1 to 4`. I don't quite understand how `toplot$1` contains two columns of data. – Vlo Jul 29 '14 at 13:26
  • I have 4 data.frames. In the code above you can see 4 sets of points for the 4 data.frames: toplot$1, toplot$2, toplot$3, toplot$4. Each data.frame contains 2 columns: date and x (for value). I've already got 4 sets of points on the same picture, and I want to get 4 regression lines for them. What do I need to write in the aes of geom_smooth? – user2957954 Jul 29 '14 at 13:31
  • `toplot$1` in base R calls one column in the data.frame `toplot`. I still don't understand your data structure, so it is difficult to know what the correct syntax for `geom_smooth` should be. For starters, `geom_smooth` doesn't require a formula parameter, since it is a wrapper function for `stat_smooth` and dynamically generates `formula` based on your `aes`. Do a `dput(head(toplot, 10))` if you can. – Vlo Jul 29 '14 at 13:50
  • toplot is a list of data frames: 1,2,3 and 4. toplot$1 is a single data.frame with 2 columns: date and x. In my plot "date" is situated on the x-axis and "x" - on y-axis as value. Where do you see only one column? – user2957954 Jul 29 '14 at 13:59
  • When can you call items within a list with $ operator? Wouldn't it be `z["1"]`? I meant to say that the standard convention is to put all your date in one column, and x in one column, and have a third column for your 1/2/3/4 group variable. – Vlo Jul 29 '14 at 14:17
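
For readers puzzled by the `toplot$1` notation debated in the comments above, here is a minimal sketch. It assumes `toplot` is a named list of data frames whose names are the strings "1", "2", and so on; the toy data is invented for illustration. In R, a name that starts with a digit has to be backquoted after `$`, so `toplot$1` as typed does not parse, whereas the backquoted form and `toplot[["1"]]` both return the data frame.

```r
# Hypothetical stand-in for the asker's list of data frames
toplot <- list(
  "1" = data.frame(date = as.Date("2014-07-01") + 0:2, x = c(1.0, 2.1, 2.9)),
  "2" = data.frame(date = as.Date("2014-07-01") + 0:2, x = c(4.0, 5.2, 6.1))
)

# toplot$1 would be a syntax error: names starting with a digit need backticks
first_by_dollar <- toplot$`1`
first_by_name   <- toplot[["1"]]  # double brackets return the data frame itself
wrapped         <- toplot["1"]    # single brackets return a one-element list

identical(first_by_dollar, first_by_name)
class(first_by_name)
class(wrapped)
```

Note that `toplot["1"]` is still a list, so it cannot be passed directly as the `data` argument of a geom.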

4 Answers

2

The short answer is aes(x=date, y=x, color='1'), and the proper answer is that you should learn to use ggplot2. See below for an example of how to use grouping.

library(ggplot2)
# prepare data: stack the list of data frames and label each row with its group
toplot2 <- do.call(rbind, toplot)
toplot2[, "group"] <- factor(rep(seq_along(toplot), times=sapply(toplot, nrow)))
# ggplot command
ggplot(toplot2, aes(x=date, y=x, color=group, shape=group)) +
  geom_point(size=4) +
  geom_smooth(method=lm) +
  ggtitle("title")+
  xlab("date") +
  ylab("value") +
  scale_colour_manual(name = "legend", values = c('1'='green', '2'='red', '3'='blue', '4'='brown')) +
  scale_shape_manual(name = "legend", values = c('1'=1, '2'=2, '3'=3, '4'=4))

EDIT: It seems like you have to actually add aes(x=date, y=x, group=1, color='1') to your example, or aes(x=date, y=x, color=group, shape=group, group=group) in my version. See e.g. "Adding a simple lm trend line to a ggplot boxplot" or "Joining means on a boxplot with a line (ggplot2)", although they also don't explain why.
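
One plausible explanation for why an explicit group is needed, offered as a sketch rather than a diagnosis: by default ggplot2 forms groups from the interaction of all discrete variables in a layer. If `date` is stored as character or factor (an assumption, since the question does not show the column classes), every date becomes its own group, which would produce the "Only one unique x value each group" message. Converting the column to class Date makes x continuous. The toy data below is invented.

```r
library(ggplot2)

# Invented toy data: 'date' arrives as character, e.g. after reading a CSV
toplot2 <- data.frame(
  date  = rep(c("2014-07-01", "2014-07-02", "2014-07-03", "2014-07-04"), times = 2),
  x     = c(1.0, 2.2, 2.9, 4.1, 3.0, 4.1, 5.2, 5.9),
  group = factor(rep(1:2, each = 4))
)

# With a Date column the x aesthetic is continuous, so the default
# grouping is driven by 'group' alone
toplot2$date <- as.Date(toplot2$date)

p <- ggplot(toplot2, aes(x = date, y = x, color = group, shape = group)) +
  geom_point(size = 4) +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE)

print(p)
```

If the dates cannot be converted, setting `group` explicitly, as in the EDIT above, overrides the default grouping instead.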

Community
shadow
  • Thanks, it looks nice. But with this code and this geom_smooth call I still receive the message: Only one unique x value each group.Maybe you want aes(group = 1). What does it mean? – user2957954 Jul 29 '14 at 14:50
  • 1
    Posting on stackoverflow is a method of learning – There Jan 29 '21 at 15:44
1

The short answer is probably something like:

geom_smooth(data=toplot$1, formula=y~x,method=lm, color='1',aes(x = date, y = x,group=1))

(The formula in geom_smooth is always "generic" in the sense that you reference the response and the covariate as y and x, no matter what the underlying columns are called or what you called them in the other layers.)
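
To illustrate the point about the generic formula, here is a small sketch with invented data whose columns are named `date` and `x`, as in the question. Inside `geom_smooth`, `y ~ x` refers to the aesthetics, so the response is the column `x` and the covariate is the column `date`; the base R fit shown for comparison has to use the real column names.

```r
library(ggplot2)

# Invented data; the column names follow the question
d <- data.frame(
  date = as.Date("2014-07-01") + 0:9,
  x    = c(1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 7.9, 9.2, 9.8)
)

# In the formula, y and x mean "whatever is mapped to the y and x aesthetics"
p <- ggplot(d, aes(x = date, y = x)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE)

# Comparable base R fit, written with the actual column names
fit <- lm(x ~ as.numeric(date), data = d)
coef(fit)
```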

However, this:

geom_point(data=toplot$1, aes(x=date, y=x, color='1'), size = 4, shape=1) +     
geom_point(data=toplot$2, aes(x=date, y=x, color='2'), size = 4, shape=2)+      
geom_point(data=toplot$3, aes(x=date, y=x, color='3'), size = 4, shape=3)+      
geom_point(data=toplot$4, aes(x=date, y=x, color='4'), size = 4, shape=4)

is not the right way to go. Basically, any time you find yourself repeating a single geom like this over and over again, it is a good sign that you're not using ggplot correctly.

Instead, one would generally combine all four data frames into a single one using rbind, and then create a third variable grp with values 1-4 to label each section. Then you'd just do a single layer:

geom_point(data = full_data,aes(x = date,y = x, color = grp, shape = grp), size = 4)
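
A minimal sketch of how `full_data` could be built, assuming `toplot` is a named list of data frames with columns `date` and `x`. The toy list is invented and holds only two data frames for brevity; `tagged` is a hypothetical helper name.

```r
# Invented stand-in for the asker's list of data frames
toplot <- list(
  "1" = data.frame(date = as.Date("2014-07-01") + 0:2, x = c(1.0, 2.1, 2.9)),
  "2" = data.frame(date = as.Date("2014-07-01") + 0:3, x = c(4.0, 5.2, 6.1, 7.3))
)

# Tag every data frame with its list name, then stack them into one
tagged <- Map(function(df, id) cbind(df, grp = id), toplot, names(toplot))
full_data <- do.call(rbind, tagged)
full_data$grp <- factor(full_data$grp)
rownames(full_data) <- NULL

str(full_data)
```

Taking the label from the list name keeps each row tied to its source data frame even if the list is later reordered.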
joran
  • `geom_smooth(data=toplot$1, formula=y~x,method=lm, color='1',aes(group=1))` doesn't work. The message is: Only one unique x value each group.Maybe you want aes(group = 1) – user2957954 Jul 29 '14 at 14:26
  • 2
    @user2957954 I've given you as much help as I can without a reproducible example. – joran Jul 29 '14 at 14:28
0

The smoothing function, in this case "lm", uses the variables defined in aes, so you could use something like the following:

library(ggplot2)
# toy data standing in for two of the data frames
toplot1 <- data.frame(date=seq(Sys.Date()-30,Sys.Date(),1), x= 1:31+ rnorm(31, 0, 2))
toplot2 <- data.frame(date=seq(Sys.Date()-30,Sys.Date(),1), x= 1:31+ 5 + rnorm(31, 0, 2))
sp <- ggplot() + 
    ggtitle("title")+
    xlab("date") +
    ylab("value") +
    geom_point(data=toplot1, aes(x=date, y=x, color='1'), size = 4, shape=1) +     
    geom_point(data=toplot2, aes(x=date, y=x, color='2'), size = 4, shape=2)+      
    scale_colour_manual(name = "legend", values = c('1'='green', "2"="red", "3"="blue", "4"="brown"))
# one regression line per data frame; the color label matches the points
sp <- sp  +
    geom_smooth(data=toplot1, aes(x=date, y=x, color="1"), formula = y~x, method="lm") +
    geom_smooth(data=toplot2, aes(x=date, y=x, color="2"), formula = y~x, method="lm")
plot(sp)

I don't quite understand the structure of your dataframes with the names toplot$1, etc.

WaltS
0

My script ran once I added group=1 alongside the x and y aesthetics.

graf_disp <- ggplot(dados, aes(x=Treatments, y=Survival, group=1)) +
  geom_point(aes(col=Treatments)) +
  geom_smooth(method=lm, se=FALSE) +
  labs(subtitle = "Weeds Survival Percentage",
       y = "Weeds Survival (%)", x = "Treatments")

plot(graf_disp)
Andre M