2

I'm trying to create scatterplots using data from B and C; however, I'd like to create a different scatterplot for each category in A.

A           B   C
Monday      2   4
Tuesday     4   2
Monday      3   5
Wednesday   3   10
Friday      7   9

This is the code I currently have; it just makes a normal scatterplot with my data. Is there an addition or something that I can use to automatically make scatterplots based on category?

attach(data)
plot(C, B, main="scatterplots",xlab="C", ylab="B", pch=10) 
abline(lm(B~C), col="red")  # B is the response because it is on the y-axis
Andrie
Mengll
  • I suspect this is a toy example, but I feel compelled to point out that `attach()` is not recommended in R, & you'll need more than 1 B C pair per day for scatterplots to be meaningful, perhaps a different plot is more appropriate. – gung - Reinstate Monica Sep 10 '13 at 16:19
  • @gung; this is not my real data set, but I wanted to illustrate my point. I've seen attach() in a lot of sample scripts I've been looking at for graphing, do you have a recommended way instead of that? – Mengll Sep 10 '13 at 16:52
  • check out this thread: [in-r-do-you-use-attach-or-call-variables-by-name-or-slicing?](http://stackoverflow.com/questions/1310247/) – gung - Reinstate Monica Sep 10 '13 at 17:15
  • If you use the formula notation of `plot`, you can use both the `data` and `subset` arguments, e.g. `df <- data.frame(x = 1:10, y = rnorm(10)); plot(y ~ x, data = df, subset = x > 5)` – Henrik Sep 10 '13 at 17:18
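
As a rough sketch of the alternative to `attach()` raised in these comments, the question's plot can be written with the formula interface and a `data` argument, and a single category can be picked out with `subset`. The data frame re-creates the toy table from the question; the names `dat` and `fit` are illustrative assumptions, not part of the original post.

```r
# Toy data re-created from the question (the name `dat` is an assumption)
dat <- data.frame(
  A = c("Monday", "Tuesday", "Monday", "Wednesday", "Friday"),
  B = c(2, 4, 3, 3, 7),
  C = c(4, 2, 5, 10, 9)
)

# Formula interface: B on the y-axis, C on the x-axis, no attach() needed
plot(B ~ C, data = dat, main = "All days", pch = 19)
fit <- lm(B ~ C, data = dat)
abline(fit, col = "red")

# Same call restricted to one category via the `subset` argument
plot(B ~ C, data = dat, subset = A == "Monday", main = "Monday", pch = 19)
```

Because every column is looked up through `data = dat`, nothing is added to the search path, which avoids the name clashes that `attach()` can cause.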

2 Answers

3

This could be a solution:

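par(mfrow=c(1, length(unique(data$A))))
for (day in unique(as.character(data$A))){
  subs <- subset(data, A==day)
  plot(subs$C, subs$B, main=day, xlab="C", ylab="B")
  # fit on this day's rows only; B is the response because it is on the y-axis
  fit <- lm(B~C, data=subs)
  # skip the line when the slope cannot be estimated (e.g. a single point)
  if (!anyNA(coef(fit))) abline(fit, col="red")
}

Note that each plot chooses its own axis ranges by default; pass the same xlim and ylim values to every plot() call if you want the red fit lines to be comparable across days. Does this help?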

user1981275
  • Thank you! I've created the scatterplots I need. However, they're super scrunched up. I suppose this is what you mean by adjusting the lim values? – Mengll Sep 10 '13 at 16:15
  • This is because the code plots all four plots in one row. You can resize the plotting window manually or create the plots in different windows using for instance the `x11` command. When you save the plots with `jpeg` or `pdf`, you can specify width and height as well. – user1981275 Sep 10 '13 at 17:05
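
One way to act on the last comment, sketched here under assumptions: arrange the panels in a roughly square grid instead of a single row, and write them to a PDF with an explicit width and height. The data frame `dat` re-creates the question's toy table, and the output file name is hypothetical.

```r
# Toy data re-created from the question (the name `dat` is an assumption)
dat <- data.frame(
  A = c("Monday", "Tuesday", "Monday", "Wednesday", "Friday"),
  B = c(2, 4, 3, 3, 7),
  C = c(4, 2, 5, 10, 9)
)
days <- unique(dat$A)

# Roughly square grid instead of one long row
n_col <- ceiling(sqrt(length(days)))
n_row <- ceiling(length(days) / n_col)

# Hypothetical output file; width and height are given in inches
out_file <- file.path(tempdir(), "scatter_by_day.pdf")
pdf(out_file, width = 8, height = 8)
par(mfrow = c(n_row, n_col))
for (day in days) {
  subs <- dat[dat$A == day, ]
  # Shared axis limits keep the panels comparable
  plot(subs$C, subs$B, main = day, xlab = "C", ylab = "B",
       xlim = range(dat$C), ylim = range(dat$B))
}
dev.off()
```

The same loop works with `jpeg()` or `png()` in place of `pdf()`, although those devices take their size in pixels by default.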
3

Showing a plot conditioned on another variable is what the lattice package was designed to do. In your case it may be as simple as:

library(lattice)
xyplot(B~C|A, data=data, type=c('p','r'))
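
To make the call above easier to try, here is a minimal self-contained sketch. It uses simulated stand-in data (an assumption, not the asker's data) with several B/C pairs per day so that each panel has enough points for a regression line, and it adds the `layout` argument to control the panel grid.

```r
library(lattice)

# Simulated stand-in data (an assumption): six B/C pairs per day
set.seed(1)
dat <- data.frame(
  A = rep(c("Monday", "Tuesday", "Wednesday", "Friday"), each = 6),
  C = runif(24, min = 0, max = 10)
)
dat$B <- 1 + 0.5 * dat$C + rnorm(24)

# type = "p" draws the points, type = "r" adds a per-panel regression line;
# layout = c(2, 2) arranges the four panels as 2 columns by 2 rows
p <- xyplot(B ~ C | A, data = dat, type = c("p", "r"), layout = c(2, 2))

# Lattice plots are objects: print them explicitly inside functions or loops
print(p)
```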

The ggplot2 package also does this using faceting:

library(ggplot2)
qplot( C, B, data=data, facets= A ~ .) + geom_smooth(method='lm')
Greg Snow