R - Limit output of summary.princomp

Question

I'm running a principal component analysis on a dataset with more than 1000 variables. I'm using RStudio, and when I run summary() to see the cumulative variance of the components, I can only see the last few hundred components. How do I limit the summary to show only, say, the first 100 components?

@digemall Not really, the dataset is huge. I'm just running: prin <- princomp(train[c(2:1777)]); summary(prin). When I do that, it shows the info for all 1776 principal components. I only need the first 100 or so. — user1209675, Apr 07 '12 at 16:07
Yes, of course not the full code. I meant a little example to understand exactly your steps. Anyway @joran got the point ;) — digEmAll, Apr 07 '12 at 16:44

score 2 · Answer 1 · answered Jul 27 '12 at 23:04

I tried this and it seems to be working: l <- loadings(prin); l[, 1:100]
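A minimal sketch of this approach, assuming the built-in USArrests data in place of the asker's dataset (so it keeps 2 components rather than 100; the names pca, n and first_n are illustrative). Note that this shows the loadings only, not the cumulative variance:

```r
# USArrests stands in for the asker's data (assumption); it has 4 variables.
pca <- princomp(USArrests, cor = TRUE)

# loadings() returns a variables-by-components matrix.
l <- loadings(pca)

# Keep only the first n components (columns).
# drop = FALSE keeps the result a matrix even when n is 1.
n <- 2
first_n <- l[, 1:n, drop = FALSE]
print(round(first_n, 3))
```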

wj4f

joran · Accepted Answer · 2012-04-07T16:36:23.560

It's pretty easy to modify print.summary.princomp (you can see the original code by typing stats:::print.summary.princomp) to do this:

pcaPrint <- function(x, digits = 3, loadings = x$print.loadings, cutoff = x$cutoff, n, ...)
{
    #Check for sensible value of n; default to full output
    if (missing(n) || n > length(x$sdev) || n < 1){n <- length(x$sdev)}
    vars <- x$sdev^2
    vars <- vars/sum(vars)
    cat("Importance of components:\n")
    print(rbind(`Standard deviation` = x$sdev[1:n], `Proportion of Variance` = vars[1:n], 
        `Cumulative Proportion` = cumsum(vars)[1:n]))
    if (loadings) {
        cat("\nLoadings:\n")
        cx <- format(round(x$loadings, digits = digits))
        cx[abs(x$loadings) < cutoff] <- paste(rep(" ", nchar(cx[1, 
            1], type = "w")), collapse = "")
        # drop = FALSE keeps a matrix (and its column name) when n is 1
        print(cx[, 1:n, drop = FALSE], quote = FALSE, ...)
    }
    invisible(x)
}

pcaPrint(summary(princomp(USArrests, cor = TRUE),
                 loadings = TRUE, cutoff = 0.2), digits = 2, n = 2)

Edited to include a basic check for a sensible value of n. Now that I've done this, I wonder whether it's worth suggesting to R Core as a permanent addition; it seems simple and might be useful.
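If a custom print function is more than you need, the same table can be built as an ordinary matrix and subset by column. This is a minimal sketch, assuming the built-in USArrests data; the names importance and n are illustrative and not part of princomp:

```r
pca <- princomp(USArrests, cor = TRUE)

# Proportion of total variance explained by each component.
vars <- pca$sdev^2 / sum(pca$sdev^2)

# The same three rows that print.summary.princomp shows, kept as a plain matrix.
importance <- rbind(
  "Standard deviation"     = pca$sdev,
  "Proportion of Variance" = vars,
  "Cumulative Proportion"  = cumsum(vars)
)

# Show only the first n components (columns).
n <- 2
print(round(importance[, seq_len(n), drop = FALSE], 3))
```

Because importance is a regular matrix, any column range works, for example importance[, 50:100] on a larger dataset.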

Thank you so much. Exactly what I needed. This makes data mining applications so much easier. — user1209675, Apr 07 '12 at 16:41
@joran: yes, IMO it's a feature worth submitting to the R Core team. — digEmAll, Apr 07 '12 at 16:45

score 0 · Answer 3 · answered Apr 07 '12 at 16:23

You can put the loadings in matrix form, save the matrix to a variable, and then subset it (a la matrix[, 1:100]) to see the first, middle, or last n components. In this example I've used head(), which returns the first 100 rows (variables) across all columns rather than the first 100 components. Each column is a principal component.

head(
  matrix(
    prin$loadings, 
      ncol=length(dimnames(prin$loadings)[[2]]),
      nrow=length(dimnames(prin$loadings)[[1]])
  ),
100)
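The difference between head() and column indexing on a loadings matrix can be sketched as follows, assuming the built-in USArrests data in place of the asker's dataset (m, by_rows and by_cols are illustrative names):

```r
pca <- princomp(USArrests, cor = TRUE)

# unclass() strips the "loadings" class and leaves a plain numeric matrix
# with the variable and component names intact.
m <- unclass(pca$loadings)

by_rows <- head(m, 2)              # first 2 variables, all 4 components
by_cols <- m[, 1:2, drop = FALSE]  # all 4 variables, first 2 components

print(dim(by_rows))
print(dim(by_cols))
```

To limit the number of components, column indexing is the form to use.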
