
I have baseball data and am hoping to filter a data frame down to the games a player played in during the week before and after his birthday. I'm having trouble filtering by just the month and day.

Code:

#create variable that is the month-day of the start of the 2017 season
start_2017 = format(as.Date(seasons_g_2017[seasons_g_2017$name == 2017,]$starts_on), "%m%d")

#create variable that is the month-day of the end of the 2017 season
end_2017 = format(as.Date(seasons_g_2017[seasons_g_2017$name == 2017,]$ends_on), "%m%d")

#create column in players_data that shows month-day of the player's birthday
players_data$birth_date_filter = as.Date(players_data$birth_date,"%m%d")

#filter players_data to only players who have a birthday during the actual season
players_data = players_data[players_data$birth_date_filter >= start_2017 & 
players_data$birth_date_filter <= end_2017,]
Artem
BH57
  • Please share a sample of your data using `dput()` (not `str` or `head` or a picture/screenshot) so others can help. See more here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 – Tung Sep 18 '18 at 00:02
  • `lubridate` is a handy R tool for date/time work. I can't help much more without sample data. – Paul Sep 18 '18 at 01:20
  • @BH57, could you provide the desired behavior, a specific problem or error, and the shortest code necessary to reproduce it in the question itself? Questions without a clear problem statement are not useful to other readers. See: [How to create a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). – Artem Sep 23 '18 at 09:18

1 Answer


You can use the lubridate package to handle date-time objects. For your data, you just need to prepend 2017 to your date strings and then convert them to date objects. After that, you can easily make comparisons and filter:

library(lubridate)

# simulation
seasons_g_2017 = data.frame(name = 2017, 
                            starts_on = "April 2",
                            ends_on = "November 1", stringsAsFactors = FALSE)

players_data = data.frame(name = c("John", "Bill"), birth_date = c("January 15", "June 15"), stringsAsFactors = FALSE)


# OP's data
# create variable that is the date of the start of the 2017 season
# (paste() joins with a space by default; paste0() has no sep argument)
start_2017 <- ymd(paste(2017, seasons_g_2017[seasons_g_2017$name == 2017, ]$starts_on))

# create variable that is the date of the end of the 2017 season
end_2017 <- ymd(paste(2017, seasons_g_2017[seasons_g_2017$name == 2017, ]$ends_on))

# create column in players_data that shows the player's birthday in 2017
players_data$birth_date_filter <- ymd(paste(2017, players_data$birth_date))

# filter players_data to only players who have a birthday during the actual season
players_data <- players_data[players_data$birth_date_filter >= start_2017 & 
                              players_data$birth_date_filter <= end_2017,]
players_data

Output:

  name birth_date birth_date_filter
2 Bill    June 15        2017-06-15
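
The question also mentions keeping only the games played in the week before and after a birthday, which the code above does not cover. Here is a minimal sketch of that step in base R. The `players` and `games` data frames and their column names (`birth_date`, `game_date`) are hypothetical, since no sample data was posted, and it assumes the birth dates are stored as full dates:

```r
# Hypothetical sample data; the column names are assumptions for illustration.
players <- data.frame(
  name = c("John", "Bill"),
  birth_date = as.Date(c("1990-01-15", "1992-06-15"))
)
games <- data.frame(
  name = c("Bill", "Bill", "Bill", "John"),
  game_date = as.Date(c("2017-06-05", "2017-06-10", "2017-06-22", "2017-06-14"))
)

# Move each birthday into the 2017 season year.
# A February 29 birthday would become NA here, because 2017 is not a leap year.
players$birthday_2017 <- as.Date(format(players$birth_date, "2017-%m-%d"),
                                 format = "%Y-%m-%d")

# Attach each player's 2017 birthday to his games.
games_bd <- merge(games, players[, c("name", "birthday_2017")], by = "name")

# Keep games played within 7 days before or after the birthday.
day_diff <- as.numeric(games_bd$game_date - games_bd$birthday_2017)
birthday_week_games <- games_bd[abs(day_diff) <= 7, ]
birthday_week_games
```

With this sample data, only Bill's games on 2017-06-10 and 2017-06-22 are kept. If a season spanned two calendar years, a birthday near the new year would need extra handling.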
Artem