I have an extensive data set and am trying to get a sample that meets several conditions.

I want my sample to contain only observations where the variable `type` is a, b, or c, and where the variable `time` falls between the years 2010 and 2017 (the data set has observations from 2010 until 2018).

I have been trying `nameDataset=="a", "b", "c" & …`, but honestly I am not sure how to tackle this problem.

  • 1
    Hi R. Salzmann. Welcome to StackOverflow! Please read the info about how to give a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). That way you can help others help you! Regarding your question: we can index rows in base R using conditions. For example: `nameDataset[nameDataset$type %in% c("a", "b", "c") & nameDataset$Years >= 2010 & nameDataset$Years <= 2017, ]` – dario Mar 02 '20 at 10:33
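A minimal sketch of the base R indexing suggested in the comment, run on hypothetical toy data. The data frame contents and the name `sampleData` are assumptions for illustration, and the year column is called `time` as in the question rather than `Years`.

```r
# Hypothetical toy data standing in for the real data set
nameDataset <- data.frame(
  type = c("a", "b", "c", "d", "a", "c"),
  time = c(2010, 2013, 2018, 2015, 2017, 2009),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE where type is a, b or c AND the year is within 2010..2017
keep <- nameDataset$type %in% c("a", "b", "c") &
  nameDataset$time >= 2010 &
  nameDataset$time <= 2017

# Row indexing: keep the matching rows and all columns
sampleData <- nameDataset[keep, ]
sampleData
```

`%in%` tests each value of `type` against the whole set at once, which is why it works where `== "a", "b", "c"` does not.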

1 Answer

Here is a way to filter using the dplyr package (since no data was provided, I used the iris dataset):

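suppressPackageStartupMessages(library(dplyr))

# Convert iris to a tibble and store Species as character instead of factor
iris <- iris %>%
  as_tibble() %>%
  mutate(Species = as.character(Species))

# Keep rows matching both conditions; between() is inclusive on both ends
iris %>%
  filter(
    Species %in% c("setosa", "virginica"),
    between(Sepal.Length, 4.5, 5)
  )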
#> # A tibble: 25 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#>  1          4.9         3            1.4         0.2 setosa 
#>  2          4.7         3.2          1.3         0.2 setosa 
#>  3          4.6         3.1          1.5         0.2 setosa 
#>  4          5           3.6          1.4         0.2 setosa 
#>  5          4.6         3.4          1.4         0.3 setosa 
#>  6          5           3.4          1.5         0.2 setosa 
#>  7          4.9         3.1          1.5         0.1 setosa 
#>  8          4.8         3.4          1.6         0.2 setosa 
#>  9          4.8         3            1.4         0.1 setosa 
#> 10          4.6         3.6          1           0.2 setosa 
#> # ... with 15 more rows

Once you are satisfied with the result, you can store it in an object by assigning the whole pipeline to a name. The same approach should work with dates too.
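Applied to the variables described in the question, the same pattern might look like the sketch below. The toy data and the name `sampleData` are assumptions for illustration, since the real data set was not provided.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical toy data; replace with the real data frame
nameDataset <- tibble(
  type = c("a", "b", "c", "d", "a", "c"),
  time = c(2010, 2013, 2018, 2015, 2017, 2009)
)

# Conditions separated by commas in filter() are combined with AND;
# between() includes both end points, so 2010 and 2017 are kept
sampleData <- nameDataset %>%
  filter(
    type %in% c("a", "b", "c"),
    between(time, 2010, 2017)
  )

sampleData
```

If `time` held `Date` values instead of plain years, the bounds passed to `between()` would need to be dates as well, for example `as.Date("2010-01-01")` and `as.Date("2017-12-31")`.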

cbo