I am working on a car dataset in R. It has a column named fuel, of class factor, which divides the cars into 5 types. I want to remove 3 of those types from the column. A summary of the column looks like this:

fuel:
 CNG     :  40
 Diesel  :2133
 Electric:   1
 LPG     :  23
 Petrol  :2120

How can I remove the factor levels CNG, Electric and LPG with one command?

I have tried the following. It works, but I think there is a better way, such as a one-line command.

1.

car <- car[!car$fuel == "CNG", ]
car <- car[!car$fuel == "Electric", ]
car <- car[!car$fuel == "LPG", ]

I also tried the following, but it did not work. Why not?

2.

car <- car[!car$fuel == "CNG"||"Electric"||"LPG", ]

2 Answers

A common solution is something along the lines of:

car[!(car$fuel %in% c("CNG", "Electric", "LPG")), ]
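
As a minimal, self-contained sketch of the same idea: the toy `car` data frame and the names `unwanted` and `kept` below are made up for illustration and are not the asker's data. Note that subsetting removes the rows but keeps the now-empty factor levels; base R's `droplevels()` is one way to discard them if that matters.

```r
# Toy stand-in for the real data; the values are made up for illustration.
car <- data.frame(
  fuel = factor(c("CNG", "Diesel", "Electric", "LPG", "Petrol", "Diesel", "Petrol")),
  id = 1:7
)

unwanted <- c("CNG", "Electric", "LPG")

# Keep only the rows whose fuel is not one of the unwanted types.
kept <- car[!(car$fuel %in% unwanted), ]

# The rows are gone, but the factor still carries all five levels.
table(kept$fuel)

# Optional: discard levels that no longer occur in the data.
kept <- droplevels(kept)
levels(kept$fuel)
```

Whether to call `droplevels()` depends on what happens next: summaries and plots will otherwise keep showing the empty categories with a count of zero.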

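For the second attempt to work, two things need to change. First, use | rather than ||, since you are comparing whole vectors element by element. Second, write out each comparison in full: "Electric" on its own is a character string, not a logical test, so R cannot combine it with |.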

car[!(car$fuel == "CNG" | car$fuel == "Electric" | car$fuel == "LPG"), ]

By De Morgan's laws, this is equivalent to:

car[car$fuel != "CNG" & car$fuel != "Electric" & car$fuel != "LPG", ]
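
To check that the three conditions select the same rows, here is a small sketch on made-up data (the `car` data frame and the names `a`, `b` and `d` are illustrative only):

```r
# Toy data, made up for illustration.
car <- data.frame(
  fuel = factor(c("CNG", "Diesel", "Electric", "LPG", "Petrol", "Diesel")),
  id = 1:6
)

# Three ways of writing "fuel is none of CNG, Electric, LPG".
a <- !(car$fuel %in% c("CNG", "Electric", "LPG"))
b <- !(car$fuel == "CNG" | car$fuel == "Electric" | car$fuel == "LPG")
d <- car$fuel != "CNG" & car$fuel != "Electric" & car$fuel != "LPG"

identical(a, b)
identical(b, d)
```

One caveat: the forms agree here because fuel has no missing values. If the column contained NA, == and != would return NA for those elements while %in% returns FALSE, so the results could differ.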
s_baldur

To add to sindri_baldur's solution, you can use subset like this:

# simulate data
set.seed(2)
n <- 12
car <- data.frame(fuel = factor(
  sample.int(5, size = n, replace = TRUE), 
  labels = c("CNG", "Electric", "LPG", "Gas", "Unknown")), 
  id = 1:n)

# show alternative solution
subset(car, fuel != "CNG" & fuel != "Electric" & fuel != "LPG")
#R>      fuel id
#R> 1 Unknown  1
#R> 3 Unknown  3
#R> 5     Gas  5
#R> 6 Unknown  6

subset(car, !fuel %in% c("CNG", "Electric", "LPG"))
#R>      fuel id
#R> 1 Unknown  1
#R> 3 Unknown  3
#R> 5     Gas  5
#R> 6 Unknown  6

Your second version fails because you use || and not |. See help("Logic", package = "base"), and in particular:

& and && indicate logical AND and | and || indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector.
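
A short sketch of the difference, using a made-up vector `x` (the names `elementwise`, `single` and `msg` are illustrative). It deliberately avoids calling || on a vector longer than one, because how R handles that case has changed between versions.

```r
x <- c("CNG", "Diesel", "Petrol")

# `|` compares element by element and returns one value per element.
elementwise <- x == "CNG" | x == "Petrol"
elementwise

# `||` is meant for single TRUE/FALSE values, e.g. inside an if() condition.
single <- (length(x) > 0) || (x[1] == "Diesel")
single

# A bare string is not a logical test, so OR-ing two strings is an error.
# tryCatch() is used here only to capture the error message.
msg <- tryCatch("CNG" | "Electric", error = function(e) conditionMessage(e))
msg
```

This is why the original attempt needs both changes: | instead of ||, and a full comparison such as car$fuel == "Electric" on each side.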