How to remove multiple columns in an R data frame?

Question

I am trying to remove some columns from a data frame. I want to know why it works for a single column but not for multiple columns. For example, this works:

album2[,5]<- NULL

This doesn't work:

album2[,c(5:7)]<- NULL
Error in `[<-.data.frame`(`*tmp*`, , 5:7, value = NULL) : 
replacement has 0 items, need 600

This also doesn't work:

for (i in 5: (length(album2)-1)){
 album2[,i]<- NULL
}
Error in `[<-.data.frame`(`*tmp*`, , i, value = NULL) : 
new columns would leave holes after existing columns

It would be great if you could supply a minimal reproducible example to go along with your question: something we can work from and use to show you how your question might be answered. That way others can also benefit from your question, and the accompanying answer, in the future. You can have a look at [this SO post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on how to make a great reproducible example in R. – Eric Fail, Jan 05 '16 at 17:42
@EricFail especially as, as far as I can tell, the first example "e.g. this works" doesn't actually work. — doctorG, Jan 05 '16 at 17:47
@doctorG Using `list(NULL)` made it work with multiple columns; using `NULL` with a single column worked. I will take care of reproducibility in the future. – Ahmed Elmahy, Jan 05 '16 at 17:57
See [my question here](http://stackoverflow.com/questions/19434778/behavior-of-null-on-lists-versus-data-frames-for-removing-data). — A5C1D2H2I1M1N2O1R2T1, Jan 06 '16 at 02:29
@docendodiscimus can you post your comment as an answer please. Your trick with `list(NULL)` works with named columns as well as column indices. The sole posted solution only works with column indices. — Alex, Oct 04 '17 at 05:00
Also, as of R 3.4.4, running `mtc <- mtcars; class(mtc); mtc; mtc[, 3] <- NULL; mtc; mtc[, 4:6] <- NULL; mtc` will show that deleting columns this way does work. – doctorG, Apr 24 '18 at 08:41

score 64 · Answer 1 · answered Jan 05 '16 at 17:43

Basic subsetting:

album2 <- album2[, -5] #delete column 5
album2 <- album2[, -c(5:7)] # delete columns 5 through 7
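
For readers without `album2`, here is a minimal sketch of the same negative-index subsetting on a made-up data frame (the name `df` and its column names are assumptions, not taken from the question). It also illustrates one caveat: when the positions are computed and turn out to be empty, the negative index selects no columns at all rather than dropping none.

```r
# Hypothetical data frame standing in for album2: 8 columns, 3 rows.
df <- as.data.frame(matrix(1:24, nrow = 3))
names(df) <- paste0("c", 1:8)

one_dropped   <- df[, -5]       # drop column 5
range_dropped <- df[, -(5:7)]   # drop columns 5 through 7

names(range_dropped)            # "c1" "c2" "c3" "c4" "c8"

# Caveat: an empty set of positions selects zero columns, not all of them.
none <- which(names(df) %in% "no_such_column")   # integer(0)
empty_result <- df[, -none, drop = FALSE]
ncol(empty_result)              # 0
```

When positions come from a lookup like `which()`, checking `length(none) > 0` before subsetting avoids this surprise.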

answered Jan 05 '16 at 17:43 by doctorG

Dropping columns by position is not recommended, at least in my view. – Jia Gao Apr 21 '18 at 08:24
Yes, and no. The OP was posed in context of specifying column positions. If you know the desired positions, then this is fine. For others to know if your comment is useful/relevant to them, could you add why you would not recommend it? – doctorG Apr 22 '18 at 15:40
Well, what if someone adds a new column to the data and the column positions change? I agree that your answer is correct, but it's neither safe nor efficient. – Jia Gao Apr 23 '18 at 01:10
It's implicit you're at the point of knowing what column numbers you want. Getting to that point is up to you. Considering whether you're doing it interactively or programmatically (and thus what conditions you need to cope with) is also up to you. – doctorG Apr 24 '18 at 08:40

score 42 · Answer 2 · answered Jul 19 '18 at 12:42

Adding answer as this was the top hit when searching for "drop multiple columns in r":

The general version of single-column removal, e.g. df$column1 <- NULL, is to use list(NULL):

df[ ,c('column1', 'column2')] <- list(NULL)

This also works with position indices:

df[ ,c(1,2)] <- list(NULL)

This is a more general way to drop columns and, as some comments have mentioned, removing by index isn't recommended. Also, the familiar negative subset (used in other answers) doesn't work for columns given as strings:

> iris[ ,-c("Species")]
Error in -"Species" : invalid argument to unary operator
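
As a sketch of both points on a built-in dataset (the copy name `iris2` and the vector `drop_cols` are made up for illustration): `list(NULL)` removes columns by name in place, and when a new object is preferred, `setdiff()` on the column names gives a name-based counterpart to the negative index.

```r
iris2 <- iris
drop_cols <- c("Sepal.Width", "Species")   # hypothetical columns to remove

# In-place removal by name.
iris2[, drop_cols] <- list(NULL)
names(iris2)   # "Sepal.Length" "Petal.Length" "Petal.Width"

# Non-destructive alternative: keep every name not listed in drop_cols.
kept <- iris[, setdiff(names(iris), drop_cols), drop = FALSE]
identical(names(kept), names(iris2))   # TRUE
```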

Can you please explain why `list(NULL)` and not just `NULL`? — vasili111, Sep 18 '19 at 01:32

score 12 · Answer 3 · edited Jun 17 '21 at 14:51

This works for me.

x <- dplyr::select(dataset_df, -c('column1', 'column2'))
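
When the names to drop are held in a character vector, the tidyselect helpers `all_of()` and `any_of()` (available with dplyr 1.0.0 and later) make the intent explicit. The sketch below uses the built-in `mtcars` data; `drop_cols` is a made-up name, and the example assumes dplyr is installed.

```r
library(dplyr)

drop_cols <- c("mpg", "cyl")   # hypothetical columns to remove

# all_of() raises an error if any listed column is missing from the data.
trimmed <- select(mtcars, -all_of(drop_cols))

# any_of() silently skips names that are not present.
trimmed_lenient <- select(mtcars, -any_of(c(drop_cols, "no_such_column")))

names(trimmed)[1:3]   # "disp" "hp" "drat"
```

Which helper fits depends on whether a missing column should be treated as a mistake or as something to ignore.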

answered May 14 '20 at 11:29 by Dulakshi Soysa; edited Jun 17 '21 at 14:51 by marc_s

score 9 · Answer 4 · edited Mar 26 '18 at 14:43

If you only want to remove columns 5 and 7 but not 6, try:

album2 <- album2[,-c(5,7)] #deletes columns 5 and 7

answered Mar 26 '18 at 14:25 by Kara; edited Mar 26 '18 at 14:43 by Yoh Deadfall

score 7 · Answer 5 · answered Jan 24 '19 at 13:59

@Ahmed Elmahy, the following approach should help when you have a vector of column names you want to remove from your data frame:

test_df <- data.frame(col1 = c("a", "b", "c", "d", "e"), col2 = seq(1, 5), col3 = rep(3, 5))
rm_col <- c("col2")
test_df[, !(colnames(test_df) %in% rm_col), drop = FALSE]

All the best, ExploreR

answered Jan 24 '19 at 13:59 by ExploreR

what is drop doing in this context? – Kyouma Jun 24 '20 at 21:43
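
On the question of `drop`: by default, `[` on a data frame simplifies a single-column result to a plain vector. Passing `drop = FALSE` keeps the result a data frame even when only one column survives the removal. A small sketch, using a shortened version of the answer's `test_df` with a different, made-up `rm_col`:

```r
test_df <- data.frame(col1 = c("a", "b", "c"), col2 = 1:3, col3 = rep(3, 3))
rm_col <- c("col2", "col3")   # removing these leaves only col1
keep <- !(colnames(test_df) %in% rm_col)

as_vector <- test_df[, keep]                # simplified to a vector
as_frame  <- test_df[, keep, drop = FALSE]  # still a one-column data frame

is.data.frame(as_vector)   # FALSE
is.data.frame(as_frame)    # TRUE
```

This matters mostly in scripts, where later code may assume it is still working with a data frame.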

score 2 · Answer 6 · answered Nov 24 '21 at 10:09

Another solution, similar to @Dulakshi Soysa's, is to use column names and specify a range.

For example, suppose our data frame df has columns named column_1, column_2, column_3, and so on up to column_15, and we want to delete the 5th through the 10th columns.

We can specify a range using column names e.g.,

library(dplyr)
x = select(df, -c('column_5':'column_10'))

Specifying a range can save time when you are deleting multiple adjacent columns. It can also be combined with individual names when you want to remove some adjacent and some non-adjacent columns. For example, to remove the 1st column in addition to the previously specified columns, you would update the code as below:

library(dplyr)
x = select(df, -c('column_1', 'column_5':'column_10'))

score 1 · Answer 7 · edited Apr 22 '21 at 00:28

The following line will remove col_1 and col_2 from the data frame 'data':

data[!(colnames(data) %in% c('col_1','col_2'))]

answered Apr 22 '21 at 00:24 by Jacob; edited Apr 22 '21 at 00:28 by Anoushiravan R

score 1 · Answer 8 · answered Apr 22 '21 at 00:37

Here is an interesting solution I read the other day in @JoachimSchork's blog, Statistics Globe. You can remove columns by column name. You can find out more here.

library(data.table)

mtcars2 <- mtcars

setDT(mtcars2)[, c("mpg", "cyl", "disp", "hp") := NULL]

> head(mtcars2)
   drat    wt  qsec vs am gear carb
1: 3.90 2.620 16.46  0  1    4    4
2: 3.90 2.875 17.02  0  1    4    4
3: 3.85 2.320 18.61  1  1    4    1
4: 3.08 3.215 19.44  1  0    3    1
5: 3.15 3.440 17.02  0  0    3    2
6: 2.76 3.460 20.22  1  0    3    1
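
If the names to drop are stored in a variable, data.table expects that variable to be wrapped in parentheses on the left of `:=`; otherwise it is read as a literal column name. A sketch under that assumption, with `dt` and `drop_cols` as made-up names and data.table assumed to be installed. Note that `:=` modifies the table by reference, so no reassignment is needed.

```r
library(data.table)

dt <- as.data.table(mtcars)                  # a copy, so mtcars itself is untouched
drop_cols <- c("mpg", "cyl", "disp", "hp")   # hypothetical columns to remove

# Parentheses make data.table evaluate drop_cols instead of using it as a name.
dt[, (drop_cols) := NULL]

names(dt)   # "drat" "wt" "qsec" "vs" "am" "gear" "carb"
```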
