I am working in R on a data set of 104500 observations. I want to delete rows based on a column named "State" that has the values "TX" and "NY".

I am using the following code:

customers <- customers[customers$State != "TX"]

I'm getting the following error:

Error: Length of logical index vector must be 1 or 11 (the number of rows), not 104541

Can anyone please help me with this?

lebelinoz
deadpool

2 Answers

I think you missed a comma after the row condition.

customers <- customers[customers$State != "TX", ]
                                              ^

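With the comma, the condition selects rows and the empty second argument keeps all columns. Without it, R treats the logical vector as a column index, which is why the error complains about its length.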
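Here is a minimal base R sketch of the same idea that drops both states in one step. The toy data frame and the names `keep` and `customers_kept` are made up for illustration; only the `State` column and the values "TX" and "NY" come from the question.

```r
# Toy data standing in for the real `customers` table (values are made up).
customers <- data.frame(
  id    = 1:5,
  State = c("TX", "CA", "NY", "WA", "TX"),
  stringsAsFactors = FALSE
)

# TRUE for the rows to keep: State is neither "TX" nor "NY".
keep <- !(customers$State %in% c("TX", "NY"))

# Row condition before the comma, nothing after it: keep all columns.
customers_kept <- customers[keep, ]

customers_kept$State
# [1] "CA" "WA"
```

Using `%in%` rather than `!=` also avoids rows of NA in the result if `State` has missing values, since `%in%` returns FALSE for NA instead of NA.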

HTH

Please provide a reproducible example next time.

sluedtke

I suggest you learn how to use dplyr and the other packages in the tidyverse. I find them indispensable for cleaning data.

Here's how I would use dplyr to filter out both Texas and New York in your data set:

library(dplyr)
customers = filter(customers, State != "TX" & State != "NY")

Alternatively,

customers = filter(customers, !(State %in% c("TX", "NY")))
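To check that the two forms agree, here is a small self-contained sketch. It assumes dplyr is installed; the data frame and the names `by_compare` and `by_set` are invented for the example.

```r
library(dplyr)

# Made-up rows standing in for the real `customers` data.
customers <- data.frame(
  id    = 1:6,
  State = c("TX", "CA", "NY", "WA", "TX", "FL"),
  stringsAsFactors = FALSE
)

# Chained comparisons, one per state to drop.
by_compare <- filter(customers, State != "TX" & State != "NY")

# Set membership; easier to extend when more states need dropping.
by_set <- filter(customers, !(State %in% c("TX", "NY")))

# Both approaches keep the same rows on this data.
stopifnot(identical(by_compare$State, by_set$State))
by_set$State
```

The `%in%` version is usually the one to reach for once the list of states grows beyond two.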
lebelinoz