I am trying to plot a line graph in ggplot, but I am getting this error:

Aesthetics must be either length 1 or the same as the data (9): y, x, group 

This graph contains 4 lines. I have another graph that uses the same data frame but two different columns (expkm and actualkm, with dates on the x axis). I don't understand why that graph works properly but this one does not. I tried every answer I could find, but nothing works.

pred <- ggplot(data_, aes(x= data_$dates, group=1)) +
    geom_point(aes(y = data_$exp))+
    geom_point(aes(y = data_$facc))+
    geom_point(aes(y = data_$cntrmlg))+
    geom_point(aes(y = data_$top10rem))+
    geom_line(aes(y = data_$exp, color='Expected')) + 
    geom_line(aes(y = data_$facc, color='Actual'))+
    geom_line(aes(y = data_$cntrmlg, color='status'))+
    geom_line(aes(y = data_$top10rem, color='Statusy'))+
    geom_label(aes(y = data_$exp,label = data_$exp,hjust = 0,vjust = -0.2))+
    geom_label(aes(y = data_$facc,label = data_$facc,hjust = 0,vjust = 0.2 ))+
    geom_label(aes(y = data_$cntrmlg,label = data_$cntrmlg,hjust = 0,vjust = -0.2))+
    geom_label(aes(y = data_$top10rem,label = data_$top10rem,hjust = 0,vjust = 0.2 ))+
    labs(title = "Reli")+
    labs(x="Dates")+
    labs(y="")+
    guides(color = guide_legend(title = ""))

Sample data :

     expkm
    50000
    100000
    112500
    137500
    150000
    162500
    187500
    187500
    187500

   actualkm dates  exp      facc        cntrmlg     top10rem
    26013   Dec-17  32660   26013       50000       26013
    56796   Jan-18  46188   13802       75000       41405
    52689   Feb-18  56569   19357       87500       45166
    64657   Mar-18  65320   25019       100000      50039
    79445   Apr-18  73030   21508       91667       46600
    92647   May-18  80000   19592       101786      53178
    121944  Jun-18  86410   16473       75000       41183
    125909  Jul-18  92376   15900       77679       44293
    106470  Aug-18  97980   15795       67105       38241
qwww
  • I'd recommend taking a step back to first go through some `ggplot2` tutorials. There are 2 major patterns in `ggplot` that are missing in your code: assigning variables to aesthetics, such as color, and the fact that you're accessing columns of your data frame throughout all your `geom_*`s and other functions. Because of that, you don't need `$`, and you will actually cause yourself [problems](https://stackoverflow.com/q/32543340/5325862) by using it – camille Oct 31 '18 at 13:46
  • There are several examples in the r-faq tag, including [this one](https://stackoverflow.com/q/3777174/5325862), on reshaping data to get it into the long format that fits `ggplot`'s "grammar of graphics" paradigm. I'd recommend the Tidy Data and Data Visualization chapters of the free [R for Data Science book](https://r4ds.had.co.nz/) – camille Oct 31 '18 at 13:49

1 Answer

With ggplot you need a different approach to get this plot right.

Refer to this to better understand the grammar. Here is another useful guide.

You don't need a separate geom call for each line. Instead, you call the geom once and specify the grouping through the color aesthetic.

Note the use of gather in my code to get the data into long format (every column except dates is gathered):

library(ggplot2)
library(tidyr) # for the gather function
data %>% 
  gather("key", "value", -dates) %>% 
  ggplot(aes(x = dates, y = value, color = key)) +
  geom_line()

Here the complete code following your example:

data %>% 
  gather("key", "value", -dates) %>% 
  ggplot(aes(x = dates, y = value, color = key)) +
  geom_line() +
  geom_point() +
  geom_label(aes(y = value, label=key), hjust = 0, vjust = -0.2) +
  labs(title = "Reli")+
  labs(x="Dates")+
  labs(y="")+
  guides(color = guide_legend(title = ""))

Data used:

tt <- "expkm actualkm dates  exp      facc        cntrmlg     top10rem
50000 26013   Dec-17  32660   26013       50000       26013
100000 56796   Jan-18  46188   13802       75000       41405
112500 52689   Feb-18  56569   19357       87500       45166
137500 64657   Mar-18  65320   25019       100000      50039
150000 79445   Apr-18  73030   21508       91667       46600
162500 92647   May-18  80000   19592       101786      53178
187500 121944  Jun-18  86410   16473       75000       41183
187500 125909  Jul-18  92376   15900       77679       44293
187500 106470  Aug-18  97980   15795       67105       38241"

data <- read.table(text = tt, header = TRUE, stringsAsFactors = FALSE)
data$dates <- lubridate::parse_date_time(data$dates, "my") # correct date format
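As a variant of the code above, here is a minimal sketch that keeps only the four series named in the question and reshapes them with `tidyr::pivot_longer()`, which the tidyr documentation recommends over `gather()` for new code. The names `series_df`, `long`, and `series` are my own choices for illustration, the dates are rebuilt as the first day of each month rather than parsed from the text, and labelling each point with its value is an assumption based on the question's code.

```r
library(ggplot2)
library(tidyr)  # pivot_longer() needs tidyr >= 1.0.0

# Same values as the sample data, limited to the four plotted series.
# Dates are the first of each month, Dec-17 through Aug-18.
series_df <- data.frame(
  dates    = seq(as.Date("2017-12-01"), by = "month", length.out = 9),
  exp      = c(32660, 46188, 56569, 65320, 73030, 80000, 86410, 92376, 97980),
  facc     = c(26013, 13802, 19357, 25019, 21508, 19592, 16473, 15900, 15795),
  cntrmlg  = c(50000, 75000, 87500, 100000, 91667, 101786, 75000, 77679, 67105),
  top10rem = c(26013, 41405, 45166, 50039, 46600, 53178, 41183, 44293, 38241)
)

# Wide -> long: one row per (date, series) pair
long <- pivot_longer(
  series_df,
  cols      = c("exp", "facc", "cntrmlg", "top10rem"),
  names_to  = "series",
  values_to = "value"
)

# One geom per layer type; color = series splits the data into four lines
pred <- ggplot(long, aes(x = dates, y = value, color = series)) +
  geom_line() +
  geom_point() +
  geom_label(aes(label = value), hjust = 0, vjust = -0.2, show.legend = FALSE) +
  labs(title = "Reli", x = "Dates", y = "", color = "")

print(pred)
```

Because every aesthetic refers to a column of `long` by name, no `$` is needed inside `aes()`, and each layer sees vectors of the same length as the data.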
RLave
  • `data` or `data_`? I am getting the error `could not find function "gather"`, even though I have loaded the dplyr library. – qwww Oct 31 '18 at 12:05
  • `gather()` is in `library(tidyverse)` and `library(tidyr)`. [In RStudio you can type a function name, put the cursor on the name, and press F1 to search for the relevant library.] – M.Viking Oct 31 '18 at 12:10
  • Sorry about that, fixed. And it's `data`. – RLave Oct 31 '18 at 12:39
  • Great answer. As a general rule, when you're adding lots of additional geoms for a fairly simple graph, you probably need to do more in the `aes()` of the actual `ggplot` call or adjust the underlying data. – Ben G Oct 31 '18 at 12:50