
I'm looking to create a scatter plot with a rotated x-axis. Basically, I want to plot correlations on two y-axes. Ideally, the x-axis would represent time and the y-axes would represent the correlations.

data <- data.frame(
  words = c("Aliens", "Aliens", "Constitution", "Constitution", "Entitled", "Entitled"),
  dates = as.Date(c("2010-01-05", "2010-02-13", "2010-04-20", "2010-06-11", "2010-03-18", "2010-09-13")),
  Rep = c(.18, .14, .16, .45, .33, .71),
  Dem = c(.16, .38, .24, .11, .59, .34))

And this is what I was able to do so far. I don't think it really gets the point across. I could size by correlation and color by month?

plot(x = data$dates, y = data$Rep, ylim = c(0, 1.1 * max(data$Rep)),
     col = 'blue', pch = 15,
     main = 'Rep Correlations stock close', xlab = 'date', ylab = 'Republican')
axis(2, pretty(c(0, 1.1 * max(data$Rep))), col = 'blue')
par(new = TRUE)  # overlay the second plot on the same device
plot(x = data$dates, y = data$Dem, ylim = c(0, 1.1 * max(data$Dem)),
     col = 'green', pch = 20,
     axes = FALSE, xlab = '', ylab = '')
axis(4, pretty(c(0, 1.1 * max(data$Dem))), col = 'green')
mtext("Democrat", side = 4)

Any thoughts/tips?

Paul Hiemstra
crock1255
  • If you want to look at the correlation between `Rep` and `Dem`, then you should use a bivariate plot instead of 2 y axes. You could use color to encode time, like you suggest, but another nice way is with a motion chart. Like you also mention, this lets you even encode a 3rd variable using the point size. This is a "motion bubble chart". Here is an example that shows the effect very nicely: http://code.google.com/p/google-motion-charts-with-r/ – John Colby Apr 14 '12 at 22:32
  • Thanks! I played around with the motion chart, but from what I could find/code, the googleVis version only allows time to be in days or years. In other words, I couldn't get the dates to sequence by month. I'm also not too familiar with bivariate plots. Is this what you meant? [graph-gallery](http://addictedtor.free.fr/graphiques/graphcode.php?graph=104) – crock1255 Apr 15 '12 at 19:51

1 Answer

Following up on @JohnColby's comment above (and see "How can I plot with 2 different y-axes?" and http://rwiki.sciviews.org/doku.php?id=tips:graphics-base:2yaxes for arguments against creating dual y-axis plots if you can help it), how about:

dat <- data  ## `data` is also the name of a built-in function, so renaming avoids confusion
library(ggplot2)
theme_set(theme_bw())  ## I prefer this theme
## code months as a factor (months() returns locale-dependent names; English assumed)
dat$month <- factor(months(dat$dates), levels = month.name)
dat <- dat[order(dat$dates), ]
ggplot(dat, aes(Rep, Dem, colour = month)) + geom_point() +
    geom_path(aes(group = 1), colour = "gray") + geom_point(alpha = 0.4) +
    geom_text(aes(label = words), size = 4)

(adding lines between the points, then re-plotting the points so they're not obscured by the line; adding the words is cute but might be too much clutter for the full data set)


Or encode date as a continuous variable:

ggplot(dat,aes(Rep,Dem,colour=dates))+
    geom_path(aes(group=1),colour="gray")+geom_point(alpha=0.4)+
    geom_text(aes(label=words),size=4)+
    expand_limits(x=c(0,0.9))
ggsave("plotcorr2.png",width=6,height=3)


In this particular context (where both variables are measured on the same scale), there's also nothing wrong with plotting them both against the date axis:

library(reshape2)
library(plyr)
m1 <- rename(melt(dat,id.vars=c("words","dates","month")),
             c(variable="party"))

ggplot(m1,aes(dates,value,colour=party))+geom_line()+
    geom_text(aes(label=words),size=3)+
    expand_limits(x=as.Date(c("2009-12-15","2010-10-01")))
ggsave("plotcorr3.png",width=6,height=3)

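To make the reshaping step more concrete, here is a rough base R sketch of the long format that the `melt`/`rename` call above aims for: one row per date and party, with the correlation in a single `value` column. The object name `long` is made up for illustration, and the `month` column is left out for brevity.

```r
# Same toy data as in the question, sorted by date
dat <- data.frame(
  words = c("Aliens", "Aliens", "Constitution", "Constitution", "Entitled", "Entitled"),
  dates = as.Date(c("2010-01-05", "2010-02-13", "2010-04-20",
                    "2010-06-11", "2010-03-18", "2010-09-13")),
  Rep = c(.18, .14, .16, .45, .33, .71),
  Dem = c(.16, .38, .24, .11, .59, .34)
)
dat <- dat[order(dat$dates), ]

# Stack the two party columns into one `value` column; `party` records
# which original column each value came from (`long` is a hypothetical name)
long <- data.frame(
  words = rep(dat$words, times = 2),
  dates = rep(dat$dates, times = 2),
  party = factor(rep(c("Rep", "Dem"), each = nrow(dat)), levels = c("Rep", "Dem")),
  value = c(dat$Rep, dat$Dem)
)
```

A data frame in this shape can be mapped directly with `aes(dates, value, colour = party)`.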

Ben Bolker
  • I like this take. My understanding was that dual axes are a problem only when the variables are on different scales, so I figured they would be alright here since both are on the same scale. Thanks! – crock1255 Apr 15 '12 at 19:57
  • You're right. I wasn't thinking. Plotting both variables on the same scale is fine in this case. `matplot` would do it in base R graphics; reshaping is the preferred approach with `ggplot`. – Ben Bolker Apr 15 '12 at 20:50
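
For the base R route mentioned in the last comment, a minimal sketch with `matplot` might look like the following. The colours, point symbols, date format, and legend position are illustrative choices rather than anything from the thread. The dates are passed as plain numbers and the axis is drawn separately with `axis.Date`, which is one way to keep the labels readable across R versions.

```r
# Same toy data as in the question, sorted so the lines run forward in time
dat <- data.frame(
  words = c("Aliens", "Aliens", "Constitution", "Constitution", "Entitled", "Entitled"),
  dates = as.Date(c("2010-01-05", "2010-02-13", "2010-04-20",
                    "2010-06-11", "2010-03-18", "2010-09-13")),
  Rep = c(.18, .14, .16, .45, .33, .71),
  Dem = c(.16, .38, .24, .11, .59, .34)
)
dat <- dat[order(dat$dates), ]

# matplot() draws one series per column of y against a shared x and y scale
y <- as.matrix(dat[, c("Rep", "Dem")])
matplot(as.numeric(dat$dates), y, type = "b", lty = 1,
        pch = c(15, 20), col = c("blue", "green"),
        xaxt = "n", xlab = "date", ylab = "correlation")

# Draw the date axis by hand, with one tick per observation
axis.Date(1, at = dat$dates, format = "%b %d")
legend("topleft", legend = colnames(y), lty = 1,
       pch = c(15, 20), col = c("blue", "green"))
```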