My principal component functions are much shorter than the data given

Question

I'm trying to run a PCA on kelp population data measured across space (coastline segments) and over 140 time points. However, my principal component functions only have 38 points in them, whereas my data has 140. Shouldn't the PC functions be as long as the number of rows of data you give prcomp?

I've used this exact code on a very similar data matrix and the PC functions have had 140 points in them, just like the data.

setwd("C:/Users/hamiltsa/Desktop/OSU/Kelp/Data2")

#Import my dataframe with 140 rows (timepoints) and 13 columns (measurements for each segment of coastline)
d = read.csv("Kelp_segments_quarters_maxes_wide.csv")
head(d)
   Seg1 Seg6 Seg7  Seg8  Seg15 Seg17  Seg18 Seg28 Seg32 Seg36 Seg38 Seg44 Seg53
1    NA   NA   NA    NA     NA    NA     NA    NA    NA    NA    NA    NA    NA
2  7362 1341  297 11664   9045 14301   8109     0   567     0 17001  2412  1152
3 13788 2160 1665 37611 170568 30501 292887     0     0     0     0   324     0
4    NA   NA   NA    NA     NA    NA     NA    NA    NA    NA    NA   459     0
5  3942    0    0  8325  30951    NA   2799     0     0    NA   567   144  1017
6    NA   NA    0  4446   7632 32571  10188     0     0     0 13932  3906     0

#Strictly, scale. = TRUE isn't needed because all variables have the same units (i.e. percent cover)
PCA2 = prcomp(na.omit(d), center = TRUE, scale. = TRUE)
summary(PCA2)
plot(PCA2$x[,'PC1'], type = "l")

When I plot my first PC of my PCA I expect it to show a function with 140 time points. However, it shows a function with 38 time points. Am I misunderstanding how PCA works or is there something wrong with my code?

You have lots of missing data. PCA will only use complete observations. — Marius, Apr 10 '19 at 00:24
Ooooh you're right. Only 38 rows have no NAs in them. That's the problem. I'll look into methods for PCA on data with missing values. Thanks! — confused_coder, Apr 10 '19 at 00:37
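
To see the effect described in the comments, here is a minimal sketch on a made-up data frame (`toy` and its values are hypothetical, not the kelp data). It counts the rows that survive `na.omit()`, confirms that `prcomp` returns exactly that many scores, and then places the PC1 scores back on the full time axis with `NA` at the dropped time points. This only realigns the scores; it does not estimate components for the incomplete rows.

```r
# Hypothetical toy data: 10 time points (rows) and 3 segments (columns)
toy <- data.frame(
  Seg1 = c(NA, 5, 3, NA, 8, 2, 7, NA, 4, 6),
  Seg6 = c(1, 4, NA, 2, 9, 3, 5, 7, NA, 8),
  Seg7 = c(2, 6, 1, 3, 7, NA, 4, 9, 5, 10)
)

# TRUE for rows with no NA; these are the rows na.omit() keeps
keep <- complete.cases(toy)
sum(keep)

# PCA on the complete rows only, so the scores have sum(keep) rows
pca <- prcomp(toy[keep, ], center = TRUE, scale. = TRUE)
nrow(pca$x)

# Realign PC1 to the original time axis, with NA where a row was dropped
pc1_full <- rep(NA_real_, nrow(toy))
pc1_full[keep] <- pca$x[, "PC1"]
plot(pc1_full, type = "b")
```

Running `sum(complete.cases(d))` on the real data should give the 38 mentioned above if the same thing is happening there.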
