Question: Load the `brexit_polls` data frame from dslabs. How many polls had a start date (`startdate`) in April (month number 4)?

The `startdate` column in the `brexit_polls` data set contains dates from multiple years, but I want to keep only the rows whose month is April.

I have tried using a regex: `april <- brexit_polls %>% regex(startdate,"....-04-..")`

I also tried using the tibbletime package, but it wouldn't load in my R session. Any suggestions?

Bill Hileman
Austin
  • You might want to try the `lubridate` package. – mikebader Aug 30 '21 at 13:03
  • Please provide enough code so others can better understand or reproduce the problem. – Community Aug 30 '21 at 13:15
  • Please see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info for how best to make a question reproducible, including sample data and expected output given that data. Thank you! – r2evans Aug 30 '21 at 14:01

1 Answer

dat <- data.frame(startdate = seq(as.Date("2021-01-01"), len=30, by="week"))
head(dat)
#    startdate
# 1 2021-01-01
# 2 2021-01-08
# 3 2021-01-15
# 4 2021-01-22
# 5 2021-01-29
# 6 2021-02-05

library(dplyr)
dat %>%
  filter("04" == format(startdate, format="%m"))
#    startdate
# 1 2021-04-02
# 2 2021-04-09
# 3 2021-04-16
# 4 2021-04-23
# 5 2021-04-30

dat %>%
  group_by(month = format(startdate, format="%m")) %>%
  tally()
# # A tibble: 7 x 2
#   month     n
#   <chr> <int>
# 1 01        5
# 2 02        4
# 3 03        4
# 4 04        5
# 5 05        4
# 6 06        4
# 7 07        4

dat %>%
  group_by(month = format(startdate, format="%m")) %>%
  tally() %>%
  filter(month == "04")
# # A tibble: 1 x 2
#   month     n
#   <chr> <int>
# 1 04        5

I inferred dplyr from your use of `%>%`, but this works in base R as well:

subset(dat, format(startdate, format="%m") == "04")
#     startdate
# 14 2021-04-02
# 15 2021-04-09
# 16 2021-04-16
# 17 2021-04-23
# 18 2021-04-30
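
If only the count is needed, and the dates span several years, the same `format(..., "%m")` comparison can be summed directly. This is a minimal sketch in base R; `polls` and its dates are made up for illustration and are not the real `brexit_polls` data. It also shows that the pattern from the question can work if it is passed to `grepl()` on the character form of the dates.

```r
# Hypothetical sample data spanning several years (NOT the real brexit_polls)
polls <- data.frame(
  startdate = as.Date(c("2015-04-10", "2015-06-01", "2016-04-03",
                        "2016-04-28", "2016-05-15", "2017-01-20"))
)

# Extract the month as a two-character string; the year is ignored
month_chr <- format(polls$startdate, "%m")

# Count the April rows: each TRUE counts as 1 when summed
n_april <- sum(month_chr == "04")
n_april
# [1] 3

# The regex idea from the question, applied with grepl() to the
# "YYYY-MM-DD" text that as.character() produces for a Date column
n_april_regex <- sum(grepl("^....-04-..$", as.character(polls$startdate)))
n_april_regex
# [1] 3
```

Comparing the formatted month is usually the simpler of the two, since it does not depend on how the date happens to print.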
r2evans
  • Why did you use as.Date("2021-01-01")? Most of the poll start dates are older than that. Also, why did you limit it to a length of 30? I need to find all of them. Thank you for the help, I'm just a newbie trying to understand. – Austin Aug 30 '21 at 13:26
  • Austin: *you provided no sample data*. Had you given any context of what your data looks like (unknown to us), I would have used it. Regardless, discard the part of my answer where I provide a reproducible dataset `dat`, replace all follow-on mentions of `dat` with the name of your frame (perhaps `brexit_polls`), and attempt my technique. The intent of the answer is to show you a way to filter for a particular month. – r2evans Aug 30 '21 at 13:59
  • Thank you r2evans, that is my fault. The data frame is brexit_polls in RStudio, and the specific column I am trying to filter is startdate. It has dates across multiple years, so I want to keep rows from every year but only for the specific month. – Austin Aug 30 '21 at 21:01
  • I understand, but I don't have `brexit_polls`, don't know where to get it, and honestly I think it does not matter. I used a column named `startdate`, so replace `dat` in my answer with `brexit_polls` and see if it works. – r2evans Aug 30 '21 at 21:03
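
Following that last suggestion, the substitution might look like the sketch below. It assumes the dslabs package is installed and that `brexit_polls$startdate` is stored as a `Date` column. The `requireNamespace()` guard is only there so the snippet does nothing if the package is missing.

```r
# Assumption: dslabs is installed and provides brexit_polls with a
# Date column named startdate; skip quietly if the package is absent
if (requireNamespace("dslabs", quietly = TRUE)) {
  brexit_polls <- dslabs::brexit_polls

  # Same technique as the answer, with `dat` replaced by `brexit_polls`
  n_april <- sum(format(brexit_polls$startdate, "%m") == "04")
  print(n_april)
}
```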