Filter a dataframe by how many times each date occurs, then by revenue

Question

I have a dataframe called Revenue that lists dates, cities and revenues.

Date                      City               Revenue
1989-02-25                LA                 50
1989-02-25                NY                 72
1989-02-25                PAR                65
1989-02-25                ROM                71
1989-02-26                NY                 82
1989-02-26                BAC                73
1989-02-27                TOK                55
1989-02-27                BTH                83
1989-02-27                PAR                69
1989-02-27                NY                 70
1989-02-28                NY                 45
1989-03-01                HEL                95
#With 7000 more rows

What I'm trying to do is select the dates that occur exactly four times, in the example above 1989-02-25 and 1989-02-27, and so forth. The tibble should look something like this:

Date                      City               Revenue
1989-02-25                LA                 50
1989-02-25                NY                 72
1989-02-25                PAR                65
1989-02-25                ROM                71
1989-02-27                TOK                55
1989-02-27                BTH                83
1989-02-27                PAR                69
1989-02-27                NY                 70
#With 1251 more rows

The next step is to filter the rows so that only those with a revenue at or above 45 are included in my tibble. The first rows will look like the ones above, but there should be fewer rows overall.

After that, the tibble should be reduced to the lowest revenue per date, so that it looks like this (the city column is removed here with Revenue$City <- NULL):

Date                        Revenue
1989-02-25                  50
1989-02-27                  55
#With 57 more rows

Does anyone have any ideas? It is quite challenging with so many steps.

One question per question please! If you look up each of these steps separately, you'll be able to find answers. — socialscientist, Aug 04 '22 at 16:21
Does this answer your question? [Counting the number of elements with the values of x in a vector](https://stackoverflow.com/questions/1923273/counting-the-number-of-elements-with-the-values-of-x-in-a-vector) — socialscientist, Aug 04 '22 at 16:23
Does this answer your question? [how to filter data by the number of unique values in R](https://stackoverflow.com/questions/58269779/how-to-filter-data-by-the-number-of-unique-values-in-r) — user438383, Aug 04 '22 at 16:47
@dcsuka I'm looking for lowest value above 45. Will take a look at the answer below. — Henry Oufh, Aug 04 '22 at 19:35

score 2 · Answer 1 · answered Aug 04 '22 at 16:29

Here is a solution that involves some grouped filtering.

library(dplyr)  # provides %>%, as_tibble(), mutate(), group_by(), filter(), summarise()

df <- read.table(text = "Date                      City               Revenue
1989-02-25                LA                 50
1989-02-25                NY                 72
1989-02-25                PAR                65
1989-02-25                ROM                71
1989-02-26                NY                 82
1989-02-26                BAC                73
1989-02-27                TOK                55
1989-02-27                BTH                83
1989-02-27                PAR                69
1989-02-27                NY                 70
1989-02-28                NY                 45
1989-03-01                HEL                95") %>%
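  janitor::row_to_names(1) %>%   # promote the first row to column names
  as_tibble() %>%
  mutate(Date = lubridate::ymd(Date),
         Revenue = as.integer(Revenue)) %>%
  group_by(Date) %>%
  # n() counts the rows of each date before any row is dropped, so both
  # conditions are checked against the original groups.
  # Use Revenue >= 45 if revenues of exactly 45 should be kept as well.
  filter(n() == 4,
         Revenue > 45) %>%
  summarise(Revenue = min(Revenue))  # lowest remaining revenue per date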

# # A tibble: 2 × 2
#   Date       Revenue
#   <date>       <int>
# 1 1989-02-25      50
# 2 1989-02-27      55
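
If the intermediate tibbles from the question are wanted as well, the same logic can be split into three separate steps. This is only a minimal sketch, assuming dplyr is installed. The object names `revenue`, `four_times`, `at_least_45` and `lowest_per_date` are made up for illustration, the data is just the twelve sample rows from the question, and `>=` follows the question's "at or above 45" wording.

```r
library(dplyr)

# Stand-in for the asker's Revenue data: the twelve sample rows only.
revenue <- tibble(
  Date = as.Date(c("1989-02-25", "1989-02-25", "1989-02-25", "1989-02-25",
                   "1989-02-26", "1989-02-26",
                   "1989-02-27", "1989-02-27", "1989-02-27", "1989-02-27",
                   "1989-02-28", "1989-03-01")),
  City = c("LA", "NY", "PAR", "ROM", "NY", "BAC",
           "TOK", "BTH", "PAR", "NY", "NY", "HEL"),
  Revenue = c(50L, 72L, 65L, 71L, 82L, 73L, 55L, 83L, 69L, 70L, 45L, 95L)
)

# Step 1: keep only the dates that occur exactly four times.
four_times <- revenue %>%
  group_by(Date) %>%
  filter(n() == 4) %>%
  ungroup()

# Step 2: keep only rows with a revenue at or above 45 (assumed threshold rule).
at_least_45 <- four_times %>%
  filter(Revenue >= 45)

# Step 3: lowest remaining revenue per date; summarise() drops City.
lowest_per_date <- at_least_45 %>%
  group_by(Date) %>%
  summarise(Revenue = min(Revenue))

print(lowest_per_date)
```

On the sample rows this leaves the same two dates as above, with revenues 50 and 55. Keeping each step in its own object makes it easy to check the row counts after every filter.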
