R loop (or apply) that creates separate dataframes via subset

Question

I have this example data frame.

df <- data.frame (MARKET  = c("US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil", "US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil","US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil","US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil","US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil"),
                  MEAL = c("Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner", "Breakfast")
)

And I want to create separate subsets of the data frame that contain each combination of the meals and markets (i.e. Brazil_Breakfast, Brazil_Lunch, Brazil_Dinner, etc).

I take the row names from each variable here.

markets <- rownames(table(df$MARKET))
meals <- rownames(table(df$MEAL))

I know I can subset one of these like so

brazil_breakfast <- subset(df, MARKET==markets[1] & MEAL==meals[1])

But I would like to be able to automate this. Here's my draft of the for loop.

for (i in length(markets)) {
  for (j in length(meals)) {
    i_j <- subset(df, MARKET==markets[i] & MEAL==meals[j]) 
  }
}

But this only creates the last combination, US and Lunch, and it's actually, literally named i_j.

How do I create separate, new data frames via for loops? Also happy to use an apply statement.

Thank you!
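
The two symptoms described in the question (only the last combination is created, and the result is literally named i_j) can be illustrated with a corrected loop. This is a minimal sketch, not the accepted solution: `df_small`, `subsets`, and `piece` are hypothetical names, and the six-row data frame is a made-up stand-in for the question's data. `for (i in length(markets))` visits only the single number `length(markets)`, whereas `seq_along(markets)` visits every index; and `i_j` is just a fixed variable name, so the name has to be built as a string with `paste()`.

```r
# Hypothetical stand-in for the question's data (not the original 40 rows)
df_small <- data.frame(
  MARKET = c("US", "US", "UK", "UK", "Brazil", "Brazil"),
  MEAL   = c("Breakfast", "Lunch", "Breakfast", "Lunch", "Breakfast", "Breakfast"),
  value  = 1:6
)

markets <- sort(unique(df_small$MARKET))
meals   <- sort(unique(df_small$MEAL))

# Collect the subsets in a named list instead of separate variables
subsets <- list()

# seq_along(x) yields 1, 2, ..., length(x), so every index is visited
for (i in seq_along(markets)) {
  for (j in seq_along(meals)) {
    piece <- subset(df_small, MARKET == markets[i] & MEAL == meals[j])
    # Skip combinations that do not occur in the data
    if (nrow(piece) > 0) {
      # Build the element name from the values, e.g. "Brazil_Breakfast"
      subsets[[paste(markets[i], meals[j], sep = "_")]] <- piece
    }
  }
}

names(subsets)
subsets[["Brazil_Breakfast"]]
```

Keeping the pieces in one list means later steps can loop over `subsets` or use `lapply(subsets, ...)`. If separate variables are really needed, `assign()` with the same pasted name is one option, though a list is usually easier to work with.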

You need `split`, something like `df_list = split(df, df[c("MARKET", "MEAL")])`. If you really want them in the global environment, you can use `list2env(df_list, envir = .GlobalEnv)`, but in most cases you'll be better off keeping them in a `list` (or not splitting them at all... not sure why you want to do this, but you can do **a lot** "by group" with `dplyr` or `data.table`) - Gregor Thomas, Nov 21 '22 at 19:57
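
The `split` suggestion in the comment above can be sketched roughly as follows. The data frame `df_small` is a hypothetical six-row stand-in, and the `sep = "_"` and `drop = TRUE` arguments are choices made here (to get names like Brazil_Breakfast and to skip combinations with no rows), not something the comment specifies.

```r
# Hypothetical stand-in for the question's data (not the original 40 rows)
df_small <- data.frame(
  MARKET = c("US", "US", "UK", "UK", "Brazil", "Brazil"),
  MEAL   = c("Breakfast", "Lunch", "Breakfast", "Lunch", "Breakfast", "Breakfast"),
  value  = 1:6
)

# One data frame per MARKET/MEAL combination, stored in a named list.
# sep = "_" joins the two values in each name; drop = TRUE omits
# combinations that never occur (here: Brazil with Lunch).
df_list <- split(df_small, df_small[c("MARKET", "MEAL")], sep = "_", drop = TRUE)

names(df_list)
df_list[["Brazil_Breakfast"]]

# Optional: copy every list element into the global environment as its own
# variable. envir must be given explicitly; without it, list2env() creates
# a new environment rather than writing to the global one.
list2env(df_list, envir = .GlobalEnv)
```

With the list form, something like `lapply(df_list, nrow)` applies the same step to every subset without needing to know the names in advance.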

score 0 · Answer 1 · answered Nov 21 '22 at 19:57

IIUC this should do:

df <- data.frame (MARKET  = c("US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil", "US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil","US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil","US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil","US", "US", "UK", "UK", "China", "China", "Brazil", "Brazil"),
                  MEAL = c("Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner","Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner", "Breakfast", "Lunch", "Dinner", "Breakfast"),
                  value = 1:40
)

library(dplyr)  # provides %>% and group_split()

nes = df %>% group_split(MARKET, MEAL)


# <list_of<
#   tbl_df<
#     MARKET: factor<76aa3>
#     MEAL  : factor<4f900>
#     value : integer
#   >
# >[12]>
# [[1]]
# # A tibble: 4 × 3
#   MARKET MEAL      value
#   <fct>  <fct>     <int>
# 1 Brazil Breakfast     7
# 2 Brazil Breakfast    16
# 3 Brazil Breakfast    31
# 4 Brazil Breakfast    40
#
# [[2]]
# # A tibble: 3 × 3
#   MARKET MEAL   value
#   <fct>  <fct>  <int>
# 1 Brazil Dinner    15
# 2 Brazil Dinner    24
# 3 Brazil Dinner    39
#
# [[3]]
# # A tibble: 3 × 3
#   MARKET MEAL  value
#   <fct>  <fct> <int>
# 1 Brazil Lunch     8
# 2 Brazil Lunch    23
# 3 Brazil Lunch    32


.
.
.
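
The list returned by `group_split()` is unnamed (elements print as [[1]], [[2]], ...), so it is not obvious which piece is which. One possible way to attach names like Brazil_Breakfast is to pair it with `group_keys()`, which returns one row per group. This is a sketch under assumptions: `df_small`, `grouped`, `keys`, and `pieces` are hypothetical names, the data is a made-up six-row stand-in, and the `as.list()` call is only there so the result is handled as an ordinary list.

```r
library(dplyr)

# Hypothetical stand-in for the answer's data (not the original 40 rows)
df_small <- data.frame(
  MARKET = c("US", "US", "UK", "UK", "Brazil", "Brazil"),
  MEAL   = c("Breakfast", "Lunch", "Breakfast", "Lunch", "Breakfast", "Breakfast"),
  value  = 1:6
)

grouped <- df_small %>% group_by(MARKET, MEAL)

# group_keys() gives one row per group; group_split() gives one tibble per group
keys   <- group_keys(grouped)
pieces <- as.list(group_split(grouped))

# Name each piece after its MARKET and MEAL values
names(pieces) <- paste(keys$MARKET, keys$MEAL, sep = "_")

names(pieces)
pieces[["Brazil_Breakfast"]]
```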
