1

I have already plotted a bar graph, and now I'd like to add a curve going through the top point of each bar so that the trend can be shown more clearly.

The data frame is in a format like:

V1              V2
a               10
b               6
c               7
...

Here is the code I used to plot the bars:

ggplot(date_count, aes(V1, V2)) +
    geom_bar(stat = "identity") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)) +
    xlab("date") +
    ylab("Number of activity")

I have tried adding geom_line() and geom_smooth(), but neither worked. Do you have any ideas? Thanks in advance.

user5779223
  • It would have been good if you had produced your example with a public data set like iris. – CAFEBABE Jan 23 '16 at 13:15
  • In addition, a bar chart normally has a bottom point of zero. As CAFEBABE suggested, please provide a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) or show what you tried. – lukeA Jan 23 '16 at 13:27
  • Thanks. A reproducible example should begin with loading the required packages and then include (part of) the data set (e.g. `date_count <- structure(.......` or `date_count <- read.table(......`). But anyway, what if you add `+ geom_line(aes(group=1), colour="red") + geom_point(color="red", size=5)` to your plot code? I'm still not sure what your desired result should look like. – lukeA Jan 23 '16 at 13:33
  • @lukeA Thanks for the reply. Yes, it works! – user5779223 Jan 23 '16 at 16:39
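
The suggestion in the comment above works because, with a discrete x axis, ggplot2 treats each x level as its own group, so geom_line() has only one point per group and draws nothing. Setting aes(group = 1) puts all bars in a single group. A minimal sketch, assuming the three-row sample from the question stands in for the real date_count:

```r
library(ggplot2)

# Sample rows from the question (an assumption; the real data has more rows)
date_count <- data.frame(V1 = c("a", "b", "c"), V2 = c(10, 6, 7))

p <- ggplot(date_count, aes(V1, V2)) +
  geom_bar(stat = "identity") +
  # group = 1 joins all x levels into one line
  geom_line(aes(group = 1), colour = "red") +
  # mark the top of each bar
  geom_point(colour = "red", size = 5) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)) +
  xlab("date") +
  ylab("Number of activity")

print(p)
```

This keeps the x axis discrete, so the original labels appear without any extra scale settings.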

2 Answers

3

It is assumed you mean tops of bars rather than bottoms since the bottoms are all zero. We make the X axis continuous rather than discrete and in order to be able to see the added lines we make the bars white.

# input data in reproducible form   
Lines <- "V1 V2
a               10
b               6
c               7"
date_count  <- read.table(text = Lines, header = TRUE)

library(ggplot2)

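n <- nrow(date_count)

# map x to the row index so the axis is continuous and geom_line() can connect the bars
ggplot(date_count, aes(x = seq_len(n), y = V2)) + 
    geom_bar(stat = "identity", fill = "white") +  # white bars keep the lines visible
    theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)) +
    xlab("date") + 
    ylab("Number of activity") +
    # put the original V1 labels back on the numeric axis
    scale_x_continuous(breaks = seq_len(n), labels = date_count$V1) +
    geom_line() +          # straight segments through the bar tops
    geom_smooth(lty = 2)   # dashed smoothed trend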

screenshot

G. Grothendieck
  • Isn't this assuming that each V1 value occurs only once? However, using `as.numeric` in the `aes` and `levels` in `scale_x_continuous`, this can easily be fixed. – CAFEBABE Jan 23 '16 at 14:52
  • That each occurs once is an assumption of the question because (1) the data example shows this and (2) it is using `stat = "identity"`. – G. Grothendieck Jan 23 '16 at 15:01
  • The former is indeed an indication. However, the latter is imho not an argument. The "identity" mapping has its justification also when a value occurs multiple times. You can see the difference when you add `a 5` and compare the results. – CAFEBABE Jan 23 '16 at 18:17
  • It is extremely unlikely that anyone would create a bar chart with two bars having the same label. – G. Grothendieck Jan 23 '16 at 22:02
  • Which is, however, what your code is doing. The normal behavior would be to sum (identity) or count. However, you separated the values by giving each line in the original dataset a unique number. In turn, your code works differently from the OP's if you use the standard mtcars dataset, which is also used in the ggplot documentation. Just try to reproduce the example from there with lines at the top. – CAFEBABE Jan 23 '16 at 22:28
  • `stat="identity"` is used when the data frame already has the summary statistics which is the case here. See the `geom_bar` documentation for more information: http://docs.ggplot2.org/0.9.3.1/geom_bar.html – G. Grothendieck Jan 23 '16 at 22:40
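
The `as.numeric`/`levels` variant raised in the comments above could look roughly like the following. This is only a sketch under stated assumptions: the extra `a 5` row is the hypothetical duplicate from the comments, `lv` is an illustrative helper name, and `fun =` assumes ggplot2 3.3.0 or later (older versions use `fun.y =`).

```r
library(ggplot2)

# Hypothetical data with a repeated label ("a" occurs twice)
date_count <- data.frame(
  V1 = factor(c("a", "b", "c", "a")),
  V2 = c(10, 6, 7, 5)
)
lv <- levels(date_count$V1)

p <- ggplot(date_count, aes(x = as.numeric(V1), y = V2)) +
  # rows sharing a label are stacked into a single bar
  geom_bar(stat = "identity", fill = "white", colour = "grey40") +
  # line through the top of each stacked bar (sum per label)
  stat_summary(fun = sum, geom = "line") +
  # show the factor levels on the numeric axis
  scale_x_continuous(breaks = seq_along(lv), labels = lv)

print(p)
```

Mapping x to the factor codes instead of the row index means repeated labels share one x position, so the line follows the stacked totals rather than the individual rows.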
2

I'm a little confused by your "bottom point". I'm assuming that you mean the minimal point of each group.

It would be easier to reproduce with a larger sample of data. Hence, I'm using mtcars.

I interpret the "bottom" as the minimum of each group, which here is:

  aggregate(mpg ~ cyl, mtcars, min)
      cyl  mpg
    1   4 21.4
    2   6 17.8
    3   8 10.4

You can generate the plot in the following way:

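library(ggplot2)
data(mtcars)
# fun= requires ggplot2 >= 3.3.0; older versions use fun.y=
ggplot(mtcars, aes(x = cyl, y = mpg)) +
    geom_bar(stat = "identity") +                            # stacks mpg within each cyl
    stat_summary(fun = min, geom = "line", color = "red") +  # minimum of each group
    stat_summary(fun = sum, geom = "line", color = "blue")   # top (sum) of each stack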


The red line is plotted using stat_summary at the minimum value of each group, which is what you called the "bottom". The blue line is the top (sum) of each group.
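
To check where the blue line should sit, the per-group sums can be computed the same way as the minima. A small sketch using only base R; `tops` is just an illustrative name:

```r
# Per-group totals of mpg: the heights of the stacked bars
tops <- aggregate(mpg ~ cyl, mtcars, sum)
print(tops)
```

Because geom_bar(stat = "identity") stacks all rows that share an x value, these sums are the bar tops.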

CAFEBABE
  • I am so sorry that I made a mistake in the description of my question. I have now edited it. Thanks for your answer, it actually works! – user5779223 Jan 23 '16 at 16:40
  • However, you accepted the other one? The two answers differ, as you can see from my comments. – CAFEBABE Jan 23 '16 at 18:15