I'm having errors with the normal t-test:

  data <- read.table("/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/filteredFPKM_matrix.txt",sep="",header=TRUE,stringsAsFactors=FALSE)

  PGT <- cbind(data[,2],data[,7],data[,24])
  PDGT <- cbind(data[,6],data[,8])
  pval2 <- NULL
  for(i in 1:length(PGT[,1])){
     pval2 <- c(pval2,t.test(as.numeric(PDGT[i,]),as.numeric(PGT[i,]))$p.value)
     print(i)
  }

Error:

Error in t.test.default(as.numeric(PDGT[i, ]), as.numeric(PGT[i, ])) : 
  not enough 'x' observations

I cannot understand what went wrong with the vectors. Can you please tell me? I have not been able to figure it out.

C8H10N4O2
ivivek_ngs
  • Show us what your data looks like. Use `str` and `summary`. It would be great if you made your error reproducible with a simple example. – Roman Luštrik Aug 06 '13 at 08:48
  • NAs or not (see comments below), if I get your code right, it seems that you are running t-tests on 'x-vectors' (`t.test` terminology) of length 2 (rows in PDGT) against 'y-vectors' of length 3 (rows in PGT). Pretty small samples. Please correct me if I am wrong. – Henrik Aug 06 '13 at 14:19
  • Yes, this is true, it is 3 vs 2. But I am getting a lot of NaN values after I run the t.test for some rows, and if I have to remove the low-abundance expression values, the matrix will shrink. – ivivek_ngs Aug 06 '13 at 14:43
  • A small side note: there is no need to split your data frame in two and loop over rows. You may apply your function row-wise on your original data: `pvals <- apply(X = data, MARGIN = 1, function(dd) t.test(x = dd[c(6, 8)], y = dd[c(2, 7, 24)])$p.value)` – Henrik Aug 06 '13 at 17:12
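A minimal sketch of the row-wise approach from the comment above, run on made-up data. The matrix `m`, its column layout, and the helper `safe_p` are hypothetical (the question uses columns 6, 8 versus 2, 7, 24). The guard is an assumption about what is wanted: it returns `NA` instead of stopping when a row has too few non-missing values.

```r
# Toy expression matrix: columns 1-2 are one condition, columns 3-5 the other.
# (Hypothetical data; in the question the groups are columns 6, 8 and 2, 7, 24.)
m <- rbind(
  gene1 = c(5.1, 6.3, 1.2, 1.9, 1.4),
  gene2 = c(NA,  2.0, 3.1, 2.8, 3.5),   # only one usable value in group x
  gene3 = c(4.0, 4.4, 4.1, 3.9, 4.6)
)

# Hypothetical helper: return NA when either group has fewer than two
# non-missing values, which the default (Welch) t.test needs in each group.
safe_p <- function(row, x_cols, y_cols) {
  x <- row[x_cols]
  y <- row[y_cols]
  if (sum(!is.na(x)) < 2 || sum(!is.na(y)) < 2) return(NA_real_)
  t.test(x, y)$p.value
}

# Extra arguments after the function are passed on to safe_p for every row.
pvals <- apply(m, 1, safe_p, x_cols = 1:2, y_cols = 3:5)
pvals
```

Rows that cannot be tested show up as `NA` in `pvals`, so the result keeps one entry per gene and stays aligned with the original matrix.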

3 Answers

5

Most likely your data have NA values. For example:

x<-rep(NA,4)
t.test(x)

Error in t.test.default(x) : not enough 'x' observations
Bart
fkliron
  • Yes, my data have NaN values. These are gene expression values, and some of them are very low, so they have a value of 0. Do you think that's the reason for this error? If so, I will not be able to perform the t.test, since some rows have 0 in all three fields of one condition while the fields of the second condition have values. Am I correct? – ivivek_ngs Aug 06 '13 at 10:22
  • It seems to me that the missing values are the reason the `t.test` fails. Best thing to do is skip all rows which have any `NA` values. – fkliron Aug 06 '13 at 10:48
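One way to skip rows with missing values, sketched on a hypothetical toy matrix `m` (the names `keep` and `m_complete` are also made up for illustration). `complete.cases` flags the rows that contain no `NA`.

```r
# Hypothetical toy matrix with one incomplete row.
m <- rbind(
  gene1 = c(5.1, 6.3, 1.2, 1.9, 1.4),
  gene2 = c(NA,  2.0, 3.1, 2.8, 3.5),
  gene3 = c(4.0, 4.4, 4.1, 3.9, 4.6)
)

# complete.cases() returns TRUE for rows without any missing value.
keep <- complete.cases(m)

# drop = FALSE keeps the result a matrix even if only one row survives.
m_complete <- m[keep, , drop = FALSE]

rownames(m_complete)   # rows that remain for testing
```

Dropping every row with any `NA` is the strictest option; with groups of only 2 and 3 samples it may remove many genes, so a per-group check on the number of non-missing values could be a gentler alternative.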
1

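From your comment, it seems the error is caused by missing values. Many R functions let you exclude missing values by setting na.rm=TRUE; note that t.test itself has no na.rm argument and drops NA values on its own, so the error means that too few non-missing observations remain. Ref: Missing value. Before posting an R question, take a look at How to make a great R reproducible example?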
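A small sketch of how `t.test` treats missing values, using made-up numbers. It illustrates that `NA`s are removed before the test is computed, and that the error from the question appears once fewer than two usable observations are left.

```r
x <- c(2.1, NA, 3.4, 2.9)

# The NA is dropped internally; three observations remain, so this works.
res <- t.test(x)
res$parameter   # degrees of freedom: 2, i.e. 3 observations were used

# With fewer than two non-missing values there is nothing to test.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(t.test(c(NA, 1.5)), error = function(e) conditionMessage(e))
msg
```

In the question, each `x` row has only two values, so a single `NA` in that group is enough to trigger the error.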

Community
nKandel
0

To remove NA values from a vector, you can use na.omit:

> a <- sample(c(NA, 1:5), 20, replace = TRUE)
> a
 [1] NA  1  2 NA  1  5  4  4  3  3  2  4 NA  4 NA NA  1  2 NA  5
> b <- na.omit(a)
> b
 [1] 1 2 1 5 4 4 3 3 2 4 4 1 2 5
maximus