I have a data frame, trainSmall, with six columns.
> trainSmall
chr pos end LCR gc.50 type
1: 22 39491638 39491639 0 0 del_L
2: 22 29434028 29434029 0 0 ins
3: 22 28347247 28347248 0 0 del_R
4: 22 40121931 40121932 0 0 ins
5: 22 39122351 39122352 0 0 del_L
---
768: 22 27869380 27869381 0 0 del_R
769: 22 28823159 28823160 0 0 ins
770: 22 24319557 24319558 0 0 del_R
771: 22 38570330 38570331 0 0 del_L
772: 22 48182139 48182140 0 0 del_L
> is.data.frame(trainSmall)
[1] TRUE
I also have a vector, excl, with four items.
> excl
[1] "chr" "pos" "end" "type"
I would like to take all rows of trainSmall, but only the columns not in excl. So I tried:
> trainSmall[, !colnames(trainSmall) %in% excl]
[1] FALSE FALSE FALSE TRUE TRUE FALSE
But this just returns the logical vector itself, not the selected columns of the data frame.
Even doing
> trainSmall[, c(F,F,F,T,T,F)]
[1] FALSE FALSE FALSE TRUE TRUE FALSE
doesn't work as I expected.
I'm pretty confused, because this seems to be the method advocated in many places (like this answer) for subsetting a data frame by column. What am I doing wrong?
Response to possible duplicate flag: None of the solutions there seem to work in this case.
> trainSmall[, -which(names(trainSmall) %in% excl)]
[1] -1 -2 -3 -6
> trainSmall[ , !names(trainSmall) %in% excl]
[1] FALSE FALSE FALSE TRUE TRUE FALSE
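For reproducibility, here is a minimal sketch of the situation. It rests on an assumption that is not confirmed above: the `1:` row labels and the `---` separator in the printout look like the print method of the data.table package, so the sketch builds trainSmall as a data.table (which also passes is.data.frame()). The three rows are the first three from the printout, and `kept` is a name introduced only for illustration. If the assumption holds, the `with = FALSE` argument of data.table's `[` may be what is missing.

```r
library(data.table)

# Stand-in for trainSmall, ASSUMED to be a data.table (not confirmed above).
# Rows are the first three from the printout.
trainSmall <- data.table(
  chr   = c(22, 22, 22),
  pos   = c(39491638, 29434028, 28347247),
  end   = c(39491639, 29434029, 28347248),
  LCR   = c(0, 0, 0),
  gc.50 = c(0, 0, 0),
  type  = c("del_L", "ins", "del_R")
)
excl <- c("chr", "pos", "end", "type")

# A data.table inherits from data.frame, so this check passes either way.
is.data.frame(trainSmall)

# Inside a data.table, j is evaluated as an ordinary expression,
# so this evaluates the logical vector instead of selecting columns.
trainSmall[, !colnames(trainSmall) %in% excl]

# with = FALSE asks data.table to treat j as a column selector,
# the way a plain data.frame would.
kept <- trainSmall[, !colnames(trainSmall) %in% excl, with = FALSE]
names(kept)
```

If trainSmall is in fact a plain data.frame, the original `trainSmall[, !colnames(trainSmall) %in% excl]` should already select the columns, so checking `class(trainSmall)` would settle which case applies.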