This is in part related to my question yesterday.

So here is the data and a plot created in ggplot2.

library(ggplot2)
library(grid)  # arrow() and unit()

df = data.frame(date=c(rep(2008:2013, by=1)),
                value=c(303,407,538,696,881,1094))


# opts() and theme_text() are defunct in current ggplot2;
# theme(), ggtitle() and element_text() replace them.
ggplot(df, aes(date, value)) + 
        geom_bar(stat = "identity", width=0.64, fill="#336699", colour="black") +
        ylim(c(0,1400)) + ggtitle("U.S. Smartphone Users") +
        theme(axis.text.y=element_text(family="sans", face="bold")) +
        theme(axis.text.x=element_text(family="sans", face="bold")) +
        theme(plot.title = element_text(size=14, face="bold")) +
        xlab("Year") + ylab("Users (in millions)") +
        theme(axis.title.x=element_text(family="sans")) +
        theme(axis.title.y=element_text(family="sans", angle=90)) +
        geom_segment(aes(x=2007.6, xend=2013, y=550, yend=1350), arrow=arrow(length=unit(0.4,"cm")))

Is it possible to produce the squiggly trend line in the following graph with ggplot2?

I created the plot in R and then prettied it up in Adobe Photoshop, and I'm wondering if I could have produced that squiggly trend line straight in R.

If this can't be done in ggplot2, are there any specific R packages which would be amenable to this task?

I'm not asking about reproducing the graph. That's not an issue. Just producing the trend line seems to be an issue.

[Image: the plot with the squiggly trend line added in Photoshop]

Ghoul Fool
ATMathew
  • Is the curviness of the line mathematically determined by the values of the bars in some way? I doubt the shadow could be gotten with ggplot, but a mathematically defined curve with an arrow at the end should be. – Brian Diggs Aug 30 '11 at 15:24
  • @Brian No, it's not mathematically determined. The shadow isn't too important, just that the line isn't a straight line. – ATMathew Aug 30 '11 at 15:28
  • If @Hadley were here I'd bet he'd say that he hopes that this isn't possible in ggplot2. – joran Aug 30 '11 at 15:29
  • @Andrie Yes, I agree it's chart junk...but I'm still curious as to how to do it in R. – ATMathew Aug 30 '11 at 15:29
  • @joran I agree that my graph doesn't pass the academic test for a quality plot. However, I'm trying to convey a message that will allow me to make money. I'm more than willing to break some statistical rules if that means more people/companies will pay me. :) Ultimately, most consumers don't care about how a chart looks (and neither do most CEOs), so academic norms regarding data visualization don't mean too much to me. – ATMathew Aug 30 '11 at 15:32
  • Since the curve is not derived from the data, I doubt there would be any way to do it in R. Adding it freehand in Photoshop like you did is probably the best way. (And I guess that's why I'm an academic: I'd rather be right than rich. :) ) – Brian Diggs Aug 30 '11 at 15:39
  • Quasi-serious comment: dump the output from R into Illustrator and add chart junk to your heart's content. It's much more effective at cluttering charts than R is :) – Chase Aug 30 '11 at 16:09
  • Minor style point: when you define your data frame, `c(rep(2008:2013, by=1))` gives you exactly the same vector as `2008:2013`. – Richie Cotton Aug 30 '11 at 17:11

2 Answers

As per all the comments: the squiggly line is scientifically dubious, since it isn't based upon a statistical model of the data. (And it appears to show the number of smartphone users levelling off, when the data shows no such thing.) So the best advice is "don't do it".

Since you seem really keen on the idea though, yes it is possible.

You can add any line you like with geom_line. To replicate the silly line in your infographic, you could use a straight line plus a sine curve to give it wiggle. Assuming your plot is stored in a variable named p:

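n <- 100  # number of points along the curve; any reasonably large value works

p + geom_line(
  aes(date, value), 
  data = data.frame(
    date = seq(2008, 2013, length.out = n),
    # straight line from 600 to 1300 plus two full sine periods of amplitude 100
    value = seq(600, 1300, length.out = n) + 100 * sin(seq(0, 4 * pi, length.out = n))  
  ), 
  arrow = arrow(length = unit(0.4, "cm"))
)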

A better approach would be to use a loess-smoothed curve.

p + geom_smooth(method = "loess", se = FALSE)  # maybe also span = 0.5, for extra wiggliness
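
Since the curve in the infographic is not derived from the data, another option is to pick a few control points by eye and let spline() from base R draw a smooth curve through them. The following is only a sketch of that idea: the control points in ctrl are made-up values (not taken from the question), and the bar chart is rebuilt in minimal form so the snippet runs on its own.

```r
library(ggplot2)
library(grid)  # arrow() and unit()

# Hypothetical control points, chosen by eye; they are NOT derived from the data.
ctrl <- data.frame(
  date  = c(2007.6, 2009, 2010, 2011, 2012, 2013),
  value = c(550, 800, 750, 1050, 1250, 1350)
)

# Interpolate a smooth curve through the control points at 200 evenly spaced x values.
squiggle <- as.data.frame(spline(ctrl$date, ctrl$value, n = 200))
names(squiggle) <- c("date", "value")

# Minimal version of the bar chart from the question.
df <- data.frame(date = 2008:2013, value = c(303, 407, 538, 696, 881, 1094))
p <- ggplot(df, aes(date, value)) +
  geom_bar(stat = "identity", width = 0.64, fill = "#336699", colour = "black")

# Overlay the hand-placed curve, with an arrowhead at its end.
p + geom_line(
  aes(date, value),
  data = squiggle,
  arrow = arrow(length = unit(0.4, "cm"))
)
```

Moving the control points up or down changes where the line bends, which is about as close to freehand drawing as this gets in R.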
Richie Cotton

The answer to your question clearly is: do not do this with R. R is not optimized for this. You'll get what you want more quickly, and closer to the original, with a vector graphics program such as Illustrator or Inkscape (free), or whatever graphics tool you like. You could also create the graph in R and modify it afterwards in one of those programs.
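
For the create-in-R-then-edit route, one way to hand the plot over is to save it as a PDF, which keeps it as vector graphics and can be opened in Illustrator or Inkscape. This is a minimal sketch: the file name and page size are arbitrary choices for illustration, and the plot is a stripped-down version of the one in the question.

```r
library(ggplot2)

# Minimal version of the bar chart from the question.
df <- data.frame(date = 2008:2013, value = c(303, 407, 538, 696, 881, 1094))
p <- ggplot(df, aes(date, value)) +
  geom_bar(stat = "identity", width = 0.64, fill = "#336699", colour = "black")

# Hypothetical file name; ggsave() picks the PDF device from the extension.
out_file <- "smartphone_users.pdf"
ggsave(out_file, plot = p, width = 8, height = 6)  # size in inches by default
```

The bars, axes and labels then arrive in the graphics program as editable vector objects, and the squiggly arrow can be drawn on top by hand.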

I can understand the comments and agree that this graph does not pass some tests: not only academic ones, but also the look-and-feel-not-stuck-in-the-nineties test. That reference to Edward Tufte was also for aesthetic reasons. However, I think it's a legitimate question that does not deserve the downvote just because the graph looks ugly. Sometimes academics also need to get their work published outside academic journals, where graphics end up being edited by editors for the masses. So it is worth knowing in advance that it is better not to attempt this kind of thing in R.

Matt Bannert