
In the code below, `completed_games` is a data frame containing every completed college football game so far that involved at least one FBS team. `VoA_Variables` is a data frame containing stats plus my ratings and rankings for each team. `FCS` is a data frame containing ratings for FCS college football teams, which I want to reference whenever Air Force played an FCS team.

I want `home_VoA_rating` to be the home team's VoA rating, pulled from `VoA_Variables` if the home team is an FBS team and from the `FCS` data frame if it is an FCS team. I want the same for `away_VoA_rating`. The `filter()` works perfectly when I run it without the `mutate()` that follows, but the `mutate()` call fails with this error: "Error in mutate(): ! Problem while computing home_VoA_rating = case_when(...). Caused by error in case_when(): ! TRUE ~ FCS$rating[FCS$team == home_team] must be length 10 or one, not 0."

AirForce <- completed_games |>
  filter(home_team == "Air Force" | away_team == "Air Force") |>
  mutate(home_VoA_rating = case_when(home_team %in% VoA_Variables$team ~ VoA_Variables$VoA_Rating[VoA_Variables$team == home_team],
                                     TRUE ~ FCS$rating[FCS$team == home_team]),
         away_VoA_rating = case_when(away_team %in% VoA_Variables$team ~ VoA_Variables$VoA_Rating[VoA_Variables$team == away_team],
                                     TRUE ~ FCS$rating[FCS$team == away_team]),
         actual_diff = case_when(home_team == 'Air Force' ~ home_points - away_points,
                                 TRUE ~ away_points - home_points),
         projected_diff = case_when(home_team == 'Air Force' ~ home_VoA_rating - away_VoA_rating,
                                    TRUE ~ away_VoA_rating - home_VoA_rating),
         Resume_Score = projected_diff - actual_diff)

If I replace the `case_when()` calls for `home_VoA_rating` and `away_VoA_rating` with 0 (so the home and away ratings are 0 for every game in the data frame), the code runs the way it is supposed to. So the issue is only in assigning the appropriate rating to each team, depending on whether it is an FBS team and whether it is the home team.
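The length complaint in the error can be reproduced outside `mutate()`. The sketch below is a toy illustration, not the real data: the team names and ratings are invented, and `mask`, `broken`, and `looked_up` are hypothetical helper names. It assumes the cause is that `home_team` is the whole column inside `mutate()`, so `FCS$team == home_team` compares two unrelated vectors position by position instead of looking up each row's team.

```r
# Toy data: team names and ratings are invented for illustration.
FCS <- data.frame(team = c("Northern Iowa", "Montana"), rating = c(-9, -7))

# Inside mutate(), home_team is the whole column, not one row at a time.
home_team <- c("Air Force", "Wyoming", "Air Force", "Northern Iowa")

# Elementwise comparison of two unrelated vectors (with recycling):
# position 4 compares "Northern Iowa" with "Montana", so nothing matches.
mask <- FCS$team == home_team
broken <- FCS$rating[mask]
length(broken)  # 0, which case_when() cannot recycle to the number of rows

# match() does a per-element lookup and returns one value per input,
# with NA where the team is not in the FCS table.
looked_up <- FCS$rating[match(home_team, FCS$team)]
looked_up
```

Because `match()` always returns a vector as long as its first argument, a lookup written this way has the length `case_when()` expects, although teams missing from the table come back as `NA`.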

gshelor
  • Please show a small reproducible example with `dput` – akrun Nov 18 '22 at 18:19
  • The error is about length: you are subsetting from the FCS data, but `case_when` requires all arguments to be the same length. Even when it runs, the lengths can still differ. The specific error occurs when no element matches, i.e. `FCS$team == home_team` is all FALSE – akrun Nov 18 '22 at 18:24
  • It looks kind of like you're trying to manually code a join? Hard to tell without sample input and desired output. – Gregor Thomas Nov 18 '22 at 18:44
  • @GregorThomas so Air Force has a rating of 3.5, so if they're the home team for a game, home_VoA_rating should be 3.5 and away_VoA_rating should be whatever the rating of the other team is. But Air Force isn't the home team in every game, so when they're not, 3.5 should be the away_VoA_rating and home_VoA_rating should be whatever the other team's rating is. And Air Force's opponent was a different team for every game, so that number should always be different. – gshelor Nov 18 '22 at 18:57
  • Yeah, still hard to understand without sample input and desired output. Please add sample input and desired output. See the site help [How to create a minimal reproducible example?](https://stackoverflow.com/help/minimal-reproducible-example) or the R-specific [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). `dput()` is a very easy way to share data copy/pasteably, `dput(your_data[1:10, ])` makes a copy/pasteable version of the first 10 rows of `your_data`, including all class and structure info. – Gregor Thomas Nov 19 '22 at 01:55
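One way to read the join suggestion in the comments is sketched below. No sample data was posted, so everything here is an assumption: the rows, scores, and ratings are invented (apart from Air Force's 3.5, which comes from the comment above), and `ratings` is a hypothetical helper table that stacks the FBS and FCS ratings so each game can be joined to it twice.

```r
library(dplyr)

# Hypothetical stand-ins for the real data frames; values are invented.
completed_games <- tibble(
  home_team   = c("Air Force", "Wyoming", "Air Force"),
  away_team   = c("Northern Iowa", "Air Force", "Colorado"),
  home_points = c(48, 14, 41),
  away_points = c(17, 24, 10)
)
VoA_Variables <- tibble(team = c("Air Force", "Wyoming", "Colorado"),
                        VoA_Rating = c(3.5, -2, -12))
FCS <- tibble(team = "Northern Iowa", rating = -9)

# One lookup table for FBS and FCS teams. FBS rows come first, so
# distinct() keeps the FBS rating if a team appears in both tables.
ratings <- bind_rows(
  select(VoA_Variables, team, rating = VoA_Rating),
  select(FCS, team, rating)
) |>
  distinct(team, .keep_all = TRUE)

AirForce <- completed_games |>
  filter(home_team == "Air Force" | away_team == "Air Force") |>
  # Join once per side; teams missing from `ratings` get NA.
  left_join(rename(ratings, home_VoA_rating = rating),
            by = c("home_team" = "team")) |>
  left_join(rename(ratings, away_VoA_rating = rating),
            by = c("away_team" = "team")) |>
  mutate(
    actual_diff = if_else(home_team == "Air Force",
                          home_points - away_points,
                          away_points - home_points),
    projected_diff = if_else(home_team == "Air Force",
                             home_VoA_rating - away_VoA_rating,
                             away_VoA_rating - home_VoA_rating),
    Resume_Score = projected_diff - actual_diff
  )
```

The joins return exactly one rating per game row, which sidesteps the length mismatch from subsetting inside `case_when()`. Whether this matches the intended output cannot be confirmed without the sample input the commenters asked for.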
