I have 13 variables, each with up to 1,000,000 elements. I want to draw a pairwise scatterplot of them, but I couldn't because the data set is too large. Any idea how to do this? I have tried this:

pairs(data.mat)
graphics.off()

library(GGally)
ggpairs(data.mat, colour='Species', alpha=0.4)

R could not finish the plot and crashed (the session closed).

epo3
minoo

1 Answer

1) Use a subset of your dataset. You will not reach any meaningful conclusion from 1 million points that you could not reach from far fewer.

2) Use pch="."; it speeds things up a surprising amount.

3) Consider plotting the joint distributions rather than the individual points.
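
A minimal sketch of the three suggestions, in case it helps. It runs on synthetic data standing in for data.mat (an assumption: a numeric matrix with 13 columns; only 100,000 rows here so the example stays quick). The sample size of 5000, the file names, and the use of smoothScatter() for the joint distributions are my own choices, not something from the question. smoothScatter() needs the KernSmooth package, which ships with standard R installations. Writing to a png() device avoids redrawing a huge plot on screen.

```r
# Synthetic stand-in for the asker's data.mat (assumption: numeric, 13 columns).
set.seed(1)
n <- 100000
data.mat <- matrix(rnorm(n * 13), ncol = 13,
                   dimnames = list(NULL, paste0("V", 1:13)))

# 1) Work on a random subset of the rows (5000 is an arbitrary choice).
idx <- sample(nrow(data.mat), 5000)
sub <- data.mat[idx, ]

# 2) Pairs plot of the subset using the cheapest plotting symbol.
png("pairs_subset.png", width = 2000, height = 2000)
pairs(sub, pch = ".")
dev.off()

# 3) Joint distributions instead of individual points: each panel shows a
#    smoothed 2D density of all rows; nrpoints = 0 suppresses outlier points.
png("pairs_density.png", width = 2000, height = 2000)
pairs(data.mat,
      panel = function(x, y, ...) smoothScatter(x, y, add = TRUE, nrpoints = 0))
dev.off()
```

With the real data, replace the synthetic matrix with your own data.mat and adjust the sample size until the subset plot looks stable from one random draw to the next.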

mkt