I have a dataframe with lots of countries and their total cases and new cases on different dates. It looks as follows:
  iso_code continent     location date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>    <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba    2020-03-13           2         2                0     106766
2 ABW      North America Aruba    2020-03-19          NA        NA             33.3     106766
3 ABW      North America Aruba    2020-03-20           4         2             33.3     106766
4 ABW      North America Aruba    2020-03-21          NA        NA             44.4     106766
5 ABW      North America Aruba    2020-03-22          NA        NA             44.4     106766
6 ABW      North America Aruba    2020-03-23          NA        NA             44.4     106766
I am able to filter the dataframe to get all rows where new_cases >= 5:
df_filtered <- df %>% filter(new_cases >= 5)
However, this returns every row that meets the condition, so a country can appear many times:
  iso_code continent     location date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>    <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba    2020-03-24          12         8             44.4     106766
2 ABW      North America Aruba    2020-03-25          17         5             44.4     106766
3 ABW      North America Aruba    2020-03-27          28         9             44.4     106766
4 ABW      North America Aruba    2020-03-30          50        22             85.2     106766
5 ABW      North America Aruba    2020-04-01          55         5             85.2     106766
6 ABW      North America Aruba    2020-04-03          60         5             85.2     106766
How can I get, for each country, only the row with the earliest date on which this condition holds?
This is what my output would ideally look like:
  iso_code continent     location           date       total_cases new_cases stringency_index population
  <chr>    <chr>         <chr>              <chr>            <dbl>     <dbl>            <dbl>      <dbl>
1 ABW      North America Aruba              2020-03-24          12         8             44.4     106766
2 AFG      Asia          Afghanistan        2020-03-16          16         6             38.9   38928341
3 AGO      Africa        Angola             2020-04-19          24         5             90.7   32866268
4 ALB      Europe        Albania            2020-03-13          23        12             78.7    2877800
5 AND      Europe        Andorra            2020-03-17          14         9             31.4      77265
6 ARE      Asia          Utd. Arab Emirates 2020-02-28          19         6              8.3    9890400
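For reference, here is a minimal sketch of one way this might be done: filter as before, then group by country and keep the row with the smallest date. It assumes dplyr 1.0.0 or later (for slice_min()). The small df below is made-up toy data with only three of the columns, not my real dataset, and it converts date from character to Date first so that "earliest" is unambiguous.

```r
library(dplyr)

# Toy data for illustration only (hypothetical values, not the real dataset)
df <- tibble(
  iso_code  = c("ABW", "ABW", "ABW", "ABW", "AFG", "AFG"),
  date      = c("2020-03-19", "2020-03-20", "2020-03-24", "2020-03-25",
                "2020-03-16", "2020-03-17"),
  new_cases = c(NA, 2, 8, 5, 6, 10)
)

first_hit <- df %>%
  mutate(date = as.Date(date)) %>%             # compare real dates, not strings
  filter(new_cases >= 5) %>%                   # filter() also drops rows where new_cases is NA
  group_by(iso_code) %>%                       # one group per country
  slice_min(date, n = 1, with_ties = FALSE) %>% # earliest qualifying row in each group
  ungroup()

print(first_hit)
```

with_ties = FALSE guarantees at most one row per country even if a date were duplicated. All other columns of the original dataframe would be carried along unchanged.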