I'm trying to subset a given dataset by the years of education of the individuals. In particular, I would like to make a smaller dataset with individuals that have only 15 or 16 years of education.

However, when I use the `|` operator to include both values, I get the whole sample back. This does not happen if I subset the data to individuals with just 15, or just 16, years of education; each of those works correctly. It only fails when I include both at the same time.

The line of code I use is this one:


dataset_final <- subset(dataset_trade , Q119 == 15 | 16 )

Any idea what might be causing the problem?

ffolkvar
  • Please consider including a [verifiable example](https://stackoverflow.com/help/mcve) (a minimal, complete, and verifiable example) in your questions. Check out [this page](https://stackoverflow.com/a/5963610/4573108) for tips on R-specific MCVEs. It helps even for small tasks. You could use example data frames like `mtcars`, which are already built in. – mischva11 Apr 20 '19 at 21:16

2 Answers

You need to correct your logical expression. Because `==` is evaluated before `|`, R reads what you wrote as:

(Q119 == 15)    OR    16

16 is a nonzero value, so it is coerced to TRUE.

So you are really asking for Q119 == 15 OR TRUE,

which is TRUE for every row, so the whole dataset is returned.

Try:

dataset_final <- subset(dataset_trade , Q119 == 15 | Q119 == 16 )
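To see the difference between the two conditions, here is a minimal sketch on made-up data. The toy `dataset_trade` below is hypothetical and only stands in for the real one, with `Q119` assumed to hold years of education.

```r
# Hypothetical toy data standing in for the real dataset_trade.
dataset_trade <- data.frame(id = 1:5, Q119 = c(12, 15, 16, 18, 15))

# `==` is evaluated before `|`, so this is (Q119 == 15) | 16.
# The bare 16 is coerced to TRUE, so every element comes out TRUE.
wrong <- dataset_trade$Q119 == 15 | 16
print(wrong)

# Repeating the column name on both sides gives the intended test.
dataset_final <- subset(dataset_trade, Q119 == 15 | Q119 == 16)
print(dataset_final)
```

With this toy data, `wrong` is TRUE for all five rows, while `dataset_final` keeps only the rows where `Q119` is 15 or 16.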
mischva11

Here is a way to avoid `|` altogether.

dataset_final <- subset(dataset_trade, Q119 %in% c(15, 16))

Compared with chaining `|` comparisons, this becomes more and more convenient as the number of possible values grows.
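As a rough illustration of how this scales, the sketch below keeps the allowed values in a vector. The toy data and the name `keep_years` are hypothetical, not part of the original question.

```r
# Hypothetical toy data; Q119 is assumed to hold years of education.
dataset_trade <- data.frame(id = 1:6, Q119 = c(12, 15, 16, NA, 17, 14))

# Hypothetical vector of values to keep; extend it as needed.
keep_years <- c(14, 15, 16, 17)

dataset_final <- subset(dataset_trade, Q119 %in% keep_years)
print(dataset_final)

# Unlike `==`, `%in%` yields FALSE rather than NA for a missing value.
na_flag <- NA %in% keep_years
print(na_flag)
```

Adding another value then only means extending `keep_years`, rather than writing one more `| Q119 == ...` clause.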

Rui Barradas