I've been having trouble naming the data frames I create in a for loop. This is the basic structure of the data:

> head(year2019)
        date day_of_week away_team away_team_game_number home_team home_team_game_number away_score home_score day_night  park game_length away_AB away_H away_2B
1 2019-03-20         Wed       SEA                     1       OAK                     1          9          7         N OAK01         204      31      7       1
2 2019-03-21         Thu       SEA                     2       OAK                     2          5          4         N OAK01         267      43      9       4
3 2019-03-28         Thu       PIT                     1       CIN                     1          3          5         D CIN09         174      31      5       0
4 2019-03-28         Thu       ARI                     1       LAN                     1          5         12         D LOS03         169      33      9       4
5 2019-03-28         Thu       COL                     1       MIA                     1          6          3         D MIA02         175      36      9       5
6 2019-03-28         Thu       SLN                     1       MIL                     1          4          5         D MIL06         156      32      5       0

With this code I've been able to create dataframes for each team:

for (i in teams) {
  assign(i, year2019 %>% filter(away_team == i | home_team == i))
}

With teams <- c("ANA", "ARI", "ATL", ...)

However, I want to create separate home and away data frames for each team. I've tried the following, but nothing has worked so far:

for(i in teams) { 
  i_home <- i %>% filter(home_team == i)
  i_away <- i %>% filter(away_team == i)
}

Or

for (i in teams) {
  i1 <- filter(year2019, home_team == i)
  i2 <- filter(year2019, away_team == i)
}

Any advice on how to add a suffix such as _home or _away to i when naming these data frames?

oszkar
snowsby
  • Hi snowsby, it will be much easier to help if you can provide the structure of at least part of your data with, for example, `dput(teams)` and `dput(year2019[1:50,])`. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more info. – Ian Campbell Apr 06 '20 at 15:43
  • Trying to assign and name many elements to the workspace is a difficult route, and you will need a lot of dirty loops and eval(parse() and assign() calls. Try instead the concept of dataframe-of-dataframe championed by the package purrr. It might take some time, but will definitely pay off. – Matifou Apr 06 '20 at 15:50
  • [Don't ever create d1 d2 d3, ..., dn in the first place. Create a list d with n elements.](https://stackoverflow.com/a/24376207/1422451). – Parfait Apr 06 '20 at 16:35

1 Answer

Avoid flooding your global environment with many separate team data frames. Instead, use a single list of data frames, which is easy to index and search. For this, consider `by` or `split`.

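# Two alternative ways to build one data frame per home team; pick one.
# by() returns a "by" object, split() returns a plain named list.
home_team_dfs <- by(year2019, year2019$home_team, identity)
home_team_dfs <- split(year2019, year2019$home_team)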

# RUN SELECT OPERATIONS ON DATA FRAMES
head(home_team_dfs$ANA)
tail(home_team_dfs$ARI)
summary(home_team_dfs$ATL)


# Same approach, keyed on the away team; pick one of the two calls.
away_team_dfs <- by(year2019, year2019$away_team, identity)
away_team_dfs <- split(year2019, year2019$away_team)

# RUN SELECT OPERATIONS ON DATA FRAMES
head(away_team_dfs$ANA)
tail(away_team_dfs$ARI)
summary(away_team_dfs$ATL)
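
If you would rather keep each team's home and away games side by side, a nested list is one option. The following is only a minimal sketch: the small `year2019` and `teams` objects are made-up stand-ins for the real data, and `team_games` is a hypothetical name.

```r
# Toy stand-in for year2019 (made-up rows, only the two columns used here)
year2019 <- data.frame(
  away_team = c("SEA", "SEA", "OAK", "ARI"),
  home_team = c("OAK", "OAK", "SEA", "SEA"),
  stringsAsFactors = FALSE
)
teams <- c("ARI", "OAK", "SEA")

# One list entry per team, each holding that team's home and away games
team_games <- lapply(setNames(teams, teams), function(tm) {
  list(
    home = year2019[year2019$home_team == tm, ],
    away = year2019[year2019$away_team == tm, ]
  )
})

# Access by team, then by home/away
nrow(team_games$SEA$home)
nrow(team_games$SEA$away)
```

This keeps the `_home` / `_away` distinction from the question as list names rather than as variable names, so no `assign()` is needed.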

Note that a data frame loses no functionality when it is stored in a list, so any operation you need (e.g., `head`, `tail`, `summary`) is still available. You can also run consistent, iterative operations across the list with the apply family of functions, on one, several, or all of the underlying data frames.
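
As a rough illustration of the apply-family point, here is a small sketch using `sapply` and `lapply` on a split list. The toy `year2019` below uses made-up rows, and `home_games`, `home_scores`, and `first_rows` are hypothetical names.

```r
# Toy stand-in for year2019 (made-up rows)
year2019 <- data.frame(
  home_team  = c("OAK", "OAK", "CIN", "LAN"),
  home_score = c(7, 4, 5, 12),
  stringsAsFactors = FALSE
)

home_team_dfs <- split(year2019, year2019$home_team)

# One value per team: number of home games and mean home score
home_games  <- sapply(home_team_dfs, nrow)
home_scores <- sapply(home_team_dfs, function(df) mean(df$home_score))

# Same operation on a subset of teams, returning a list of data frames
first_rows <- lapply(home_team_dfs[c("CIN", "OAK")], head, n = 1)
```

`sapply` simplifies the result to a named vector when each element returns a single value, while `lapply` always returns a list, which suits operations that return data frames.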

Parfait