I was trying to select some rows from a dataframe.

data(Grunfeld, package="AER")

gf = Grunfeld[Grunfeld$firm == c("General Electric",
                                 "General Motors",
                                 "US Steel",
                                 "Westinghouse"), ]

I expected the result to have 80 rows, but I got only 20.

> dim(gf)

[1] 20  5

On the other hand, subset() worked.

gf = subset(x = Grunfeld, firm %in% c("General Electric",
                                      "General Motors",
                                      "US Steel",
                                      "Westinghouse"))

> dim(gf)

[1] 80  5

Does anyone know what is happening here?

Thanks in advance.

EsterRB

1 Answer

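Your first command didn't work because you used == instead of %in%. The == operator compares element by element and recycles the shorter vector, so each row of firm is checked against only one of the four names (the 1st row against "General Electric", the 2nd against "General Motors", and so on, repeating every four rows). That is why only a quarter of the expected rows came back: 20 instead of 80. %in% instead tests, for every row, whether firm matches any of the names. Try rephrasing your first subsetting command to: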

gf = Grunfeld[Grunfeld$firm %in% c("General Electric",
                                 "General Motors",
                                 "US Steel",
                                 "Westinghouse"), ]
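
If it helps to see the difference in isolation, here is a minimal sketch on a made-up data frame. The names df, wanted, eq_mask and in_mask and the firm labels "A", "B", "C" are purely illustrative and are not part of Grunfeld.

```r
# Toy stand-in for Grunfeld: 3 hypothetical firms, 4 years each (12 rows)
df <- data.frame(
  firm = rep(c("A", "B", "C"), each = 4),
  year = rep(1:4, times = 3),
  stringsAsFactors = FALSE
)
wanted <- c("A", "B")

# == recycles `wanted` down the column (A, B, A, B, ...), so each row
# is compared with just one element of `wanted`
eq_mask <- df$firm == wanted

# %in% checks, row by row, whether firm equals ANY element of `wanted`
in_mask <- df$firm %in% wanted

sum(eq_mask)  # 4: only half of the rows for firms A and B
sum(in_mask)  # 8: all rows for firms A and B
```

Note that R stays silent here because the column length (12) is a multiple of length(wanted); when it is not, == emits a "longer object length is not a multiple of shorter object length" warning, which is often the only hint that recycling happened.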

There are already answers that give more information on the differences between these operators (such as this one), if you're curious.

Rory S
    I was not aware of this difference; maybe that is why I could not find the other answer. Thank you very much. – EsterRB Jul 15 '21 at 18:36