I want to compute pairwise correlations by group (year) for a large number of observations.
Without a for loop I get the result I want:
ddply(mydata, .(year), summarise, corr=cor(x, y, use="pairwise.complete.obs"))
The result I want:
1 1 0.8366892
2 2 0.8929666
3 3 0.8364396
4 4 0.6201038
5 5 0.8914541
But when I use a for loop to run through the columns of my data set, like this:
for (i in 1:length(x))
ddply(mydata, .(year), summarise, corr=cor(x[[i]], y[[i]], use="pairwise.complete.obs"))
I get:
grp corr
1 1 0.835378
2 2 0.835378
3 3 0.835378
4 4 0.835378
5 5 0.835378
This is the same value for every year, and it looks like a single correlation computed across all years together rather than one per year.
Am I not understanding the way ddply works?
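
For reference, here is a minimal sketch of the kind of result I am after: one correlation per year for each pair of columns. It is only an illustration under assumptions: the toy data is made up, `x_cols` and `y_cols` are hypothetical vectors of column names, and it uses base R `split()` instead of `ddply`, so it is not necessarily how this should be done with plyr.

```r
# Toy data (made up for illustration): a year column and two x/y column pairs.
set.seed(1)
mydata <- data.frame(
  year = rep(1:5, each = 20),
  x1 = rnorm(100),
  x2 = rnorm(100)
)
mydata$y1 <- mydata$x1 + rnorm(100, sd = 0.5)
mydata$y2 <- mydata$x2 + rnorm(100, sd = 0.5)

# Hypothetical names of the column pairs to correlate.
x_cols <- c("x1", "x2")
y_cols <- c("y1", "y2")

# Split the rows by year once, then correlate each pair within each group.
by_year <- split(mydata, mydata$year)

results <- lapply(seq_along(x_cols), function(i) {
  data.frame(
    year = as.integer(names(by_year)),
    pair = paste(x_cols[i], y_cols[i], sep = "~"),
    # The columns are looked up inside the per-year subset d,
    # not in an object outside the grouped data.
    corr = sapply(by_year, function(d) {
      cor(d[[x_cols[i]]], d[[y_cols[i]]], use = "pairwise.complete.obs")
    })
  )
})

# Stack the per-pair tables into one data frame.
results <- do.call(rbind, results)
rownames(results) <- NULL
print(results)
```

In this sketch the columns are taken from each year's subset `d`. My guess is that in my loop `x[[i]]` and `y[[i]]` are instead taken from objects outside the grouped data, so every year sees the same full vectors, but I am not sure that is the cause.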