I have a CSV file with many columns. The first column is the user ID; the other columns record how many times the user has taken different actions. I want to plot two of these columns as lines on a single ggplot.

userid    Action1TakenTimes Action2TakenTimes
1                    0             4
2                    6             4
3                    0             1
4                    8            23
5                    4             3
6                    1             1

I have read the CSV file into an R data table and made simple plots, but I want a ggplot with smooth lines connecting the points.

plot(log(mytable.data$Action1TakenTimes))
plot(log(mytable.data$Action2TakenTimes))

I went over the following tutorial but couldn't find a similar example: http://www.ceb-institute.org/bbs/wp-content/uploads/2011/09/handout_ggplot2.pdf

add-semi-colons

1 Answer

Like this?

library(ggplot2)
library(reshape2)
gg <- melt(mytable.data,id="userid")
ggplot(gg,aes(x=userid,y=log(value),color=variable))+geom_line()

ggplot expects the data in so-called "long" format, with all the values in one column, and a second column which distinguishes the different groups. Your data is in "wide" format, with the different groups in different columns. To convert, use melt(...) in the reshape2 package.

This is a very common pattern with ggplot.
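To make the wide-versus-long distinction concrete, here is a minimal sketch that rebuilds the six rows from the question by hand (the `data.frame` construction is an assumption standing in for the real CSV import) and shows what `melt` returns.

```r
library(reshape2)

# Sample rows copied from the question; the real data would come from the CSV.
# Wide format: one column per action.
mytable.data <- data.frame(
  userid = 1:6,
  Action1TakenTimes = c(0, 6, 0, 8, 4, 1),
  Action2TakenTimes = c(4, 4, 1, 23, 3, 1)
)

# Long format: one row per (userid, action) pair.
# melt() names the new columns "variable" and "value" by default.
gg <- melt(mytable.data, id.vars = "userid")

str(gg)
```

The `variable` column holds the original column names as a factor, which is what `color=variable` in the `ggplot(...)` call maps to separate lines.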

One problem with your data is that you're taking log(0), which produces -Inf. Smoothing is meaningless in that situation. If there were no infinities you could add +stat_smooth() to the end of the ggplot(...) line to generate a loess smoothed curve.
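One possible way around the infinities, sketched here as an assumption rather than as part of the original answer, is to plot `log1p(value)`, i.e. log(1 + value), so that zero counts map to 0 instead of `-Inf`. This changes the transform, so the y-axis is no longer a plain log scale. The sample data frame is again rebuilt by hand from the question.

```r
library(ggplot2)
library(reshape2)

# Sample rows copied from the question (stand-in for the real CSV import).
mytable.data <- data.frame(
  userid = 1:6,
  Action1TakenTimes = c(0, 6, 0, 8, 4, 1),
  Action2TakenTimes = c(4, 4, 1, 23, 3, 1)
)
gg <- melt(mytable.data, id.vars = "userid")

# log1p(x) computes log(1 + x), which is finite for x = 0.
# This is a substitute for log(value), not the transform used in the answer.
p <- ggplot(gg, aes(x = userid, y = log1p(value), color = variable)) +
  geom_point() +
  stat_smooth(method = "loess", se = FALSE)

print(p)
```

With only six users per series, the smoothed curve should be treated as illustrative; on the full data set the loess fit has more points to work with.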

jlhoward
  • I was just about to press "Post your answer"... The funniest thing is that we both were too lazy to look for a duplicate (which probably exists in great numbers) – David Arenburg Aug 14 '14 at 21:07
  • Is there a way to make the lines smoother? – add-semi-colons Aug 14 '14 at 21:08
  • @Null-Hypothesis, have you heard of a great tool called "Google"? Try putting a search query of "plot smooth lines in ggplot" and see magic, only magic. [Here's](http://stackoverflow.com/questions/16789502/r-ggplot2-introduce-slight-smoothing-to-a-line-graph-with-only-a-few-datapoints) one example – David Arenburg Aug 14 '14 at 21:12
  • @DavidArenburg Actually, `stat_smooth(...)` (from the answers to duplicates), doesn't work here, because log(0) produces `NAs` and evidently `loess(...)` doesn't like `NAs`. – jlhoward Aug 14 '14 at 21:20
  • What about `rollmean`? Either way the OP didn't try to look it up first – David Arenburg Aug 14 '14 at 21:21
  • Actually, `log(0)` produces `-Inf` rather than `NA`s. `NA`s are not a problem, as you can pass `na.rm = T` to `stat_smooth`. The OP can also use methods other than `loess` via the `method` argument – David Arenburg Aug 14 '14 at 21:52