My dataset has 100 samples and 17000 variables. I would like to run PCA and visualize the data, but the resulting plot is unreadable because an arrow is drawn for every variable. How can I control the number of arrows in ggbiplot or biplot, i.e. show only the variables that contribute most?
Sample code to reproduce the problem:
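library(ggbiplot)  # needed for ggbiplot(); biplot() comes with base R (stats)

# Simulated data: 100 samples x 17000 variables
data <- matrix(rnorm(1700000), nrow = 100, ncol = 17000)
colnames(data) <- paste("X", 1:ncol(data), sep = "")
pca <- prcomp(data, scale. = TRUE, center = TRUE)
biplot(pca)
print(ggbiplot(pca, obs.scale = 1, var.scale = 1,
               groups = c(rep('a', 30), rep('b', 70))))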
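
One possible way to limit the arrows is to rank the variables by their loadings on the two plotted components and keep only the top few rows of the loading matrix before plotting. The following is a minimal sketch using only base R, not a definitive method. The names `k`, `contrib`, `top`, and `pca_top` are hypothetical, and weighting the squared loadings by each component's variance is just one of several reasonable ways to define "contribution".

```r
set.seed(1)  # reproducible toy data, same shape as in the question
data <- matrix(rnorm(100 * 17000), nrow = 100, ncol = 17000)
colnames(data) <- paste0("X", seq_len(ncol(data)))
pca <- prcomp(data, center = TRUE, scale. = TRUE)

k <- 20  # hypothetical choice: number of arrows to keep

# Score each variable by its squared loadings on PC1 and PC2,
# weighted by the variance of each component (an assumed definition).
contrib <- drop(pca$rotation[, 1:2]^2 %*% pca$sdev[1:2]^2)
names(contrib) <- rownames(pca$rotation)
top <- names(sort(contrib, decreasing = TRUE))[seq_len(k)]

# Copy the PCA object and keep only the top-k rows of the loading matrix.
# The scores in pca$x are left untouched, so the samples plot as before.
pca_top <- pca
pca_top$rotation <- pca$rotation[top, , drop = FALSE]

biplot(pca_top)  # draws k arrows instead of 17000
```

Because the reduced object still has the `x`, `sdev`, and `rotation` components, passing `pca_top` to `ggbiplot` in place of `pca` may work as well, but that is an assumption not verified here.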