
I used ComBat() for batch effect correction on my expression data. The function's inputs are the expression data, the batch covariate, and a model matrix for the outcome of interest and any covariates other than batch. I prepared all inputs following the documentation of the ComBat() function in the 'sva' package. "Pheno" is my clinical data, which includes the batchId of my samples, and "TCGA_expr_log" is my expression data. My code is below:

Pheno structure:

Sample        batchId   age  . . . 
GSM71019.CEL    396     63
GSM71020.CEL    396     58
GSM71021.CEL    410     85
GSM71020.CEL    411     58
GSM71021.CEL    410     40
.
.
.
dim(Pheno)
[1] 74 37

TCGA_expr_log structure:

             GSM71019.CEL 1      GSM71019.CEL 2      GSM71019.CEL 3  . . . 
Gene symbol          
     mapt     10.115170           8.628044                  8.779235  
     tp53     5.345168            5.063598                   5.113116
     sep1     6.348024            6.663625                   6.465892     
.
.
.
    dim(TCGA_expr_log)
    [1] 42817    74

library(sva)

batch <- Pheno$batchId

# Note: modcombat (the model matrix) is not defined in this snippet.
TCGA_expr_Co <- ComBat(as.matrix(TCGA_expr_log), batch = batch, mod = modcombat, par.prior = TRUE, mean.only = TRUE)

But when I run the function, I get the following error:

Error in apply(dat[, batch == batch_level], 1, function(x) { : 
  dim(X) must have a positive length
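
For context, base R raises this message whenever apply() is given an object that has no dimensions. The sketch below uses a toy matrix and a made-up batch vector (not the real TCGA_expr_log or Pheno) to show one way that can happen: selecting a single column from a matrix drops it to a plain vector. Whether a one-sample batch is the actual cause here is an assumption, so the first two checks (length mismatch and missing values) may be worth running on the real data as well.

```r
# Toy data only: 3 genes x 4 samples, values are placeholders.
dat <- matrix(as.numeric(1:12), nrow = 3,
              dimnames = list(c("mapt", "tp53", "sep1"), paste0("S", 1:4)))
batch <- factor(c(396, 396, 410, 411))  # batches 410 and 411 hold one sample each

# Basic sanity checks on the batch vector.
lengths_match <- length(batch) == ncol(dat)  # one batch label per sample?
has_missing   <- anyNA(batch)                # any NA batch labels?

# Number of samples per batch; a count of 1 deserves a closer look.
batch_sizes <- table(batch)
print(batch_sizes)

# Selecting a single column drops the matrix to a vector without a dim attribute.
single <- dat[, batch == "410"]
print(is.null(dim(single)))

# Calling apply() on that vector fails; capture the message instead of stopping.
msg <- tryCatch(
  apply(single, 1, var),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If table(batch) on the real data shows a batch with a single sample, or length(batch) differs from ncol(TCGA_expr_log), that would be a reasonable place to start looking.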

I would appreciate any comments on this problem. Thanks.

Mohammad
  • For me your problem starts with `Error: object 'Pheno' not found` (etc). Please consider making this question *reproducible*. Some good references for how to frame the question to include sample data: https://stackoverflow.com/questions/5963269, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. – r2evans Dec 01 '19 at 19:26
  • Thanks, @r2evans. I have added more detail about the two data frames used in my code. I would appreciate it if you shared your thoughts on my problem. – Mohammad Dec 01 '19 at 20:26
  • ***PLEASE*** read the links I provided. Among the discussion there is the suggestion to provide an *unambiguous data sample*, such as with `data.frame(...)` or `dput(head(...))`. This does not mean you need to provide all columns or all rows; quite the contrary, just enough of each to get the point across and demonstrate what you need to get across. Also, don't approximate it ... one sample shows `$batchId` and another `$BatchId`, and while possible, I doubt that you have two names where the only difference is in upper/lower case. Manually typing datasets introduces more unrelated errors. – r2evans Dec 01 '19 at 20:44
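
As a rough illustration of the suggestion in the comments, the sketch below builds a small stand-in for Pheno with data.frame() and prints it with dput(head(...)). The three rows are copied from the excerpt shown in the question; the object is a hypothetical stand-in, not the real 74 x 37 table.

```r
# Hypothetical stand-in for Pheno, using the first rows shown in the question.
Pheno <- data.frame(
  Sample  = c("GSM71019.CEL", "GSM71020.CEL", "GSM71021.CEL"),
  batchId = c(396, 396, 410),
  age     = c(63, 58, 85)
)

# dput() prints R code that recreates the object, so others can paste it directly.
dput(head(Pheno, 3))

# Round trip: evaluating the printed code rebuilds the data frame.
txt    <- capture.output(dput(Pheno))
Pheno2 <- eval(parse(text = txt))
print(all.equal(Pheno, Pheno2))
```

The same pattern applies to the expression matrix, for example dput(TCGA_expr_log[1:3, 1:3]), which keeps the sample small while preserving the exact row and column names.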
