2

I'm using the data here (specifically the cpi and grosses tibbles) and I would like to calculate each of the columns containing dollar amounts (denoted by having 'gross' or 'price' in their names) in terms of their 2020 value. I would then like to order these columns directly after their matches.

The following method was inspired by the post here:

cpi_recent <- cpi %>% # pulls most recent cpi in the tibble
  select(cpi) %>% 
  slice_tail() %>%
  pull()

grosses_adj <- grosses %>%
  mutate(year_month = floor_date(week_ending, 'month')) %>% 
  left_join(cpi, 'year_month') %>%
  mutate(across(contains(c('gross', 'price')),
                list(adj = ~ cpi_recent/cpi * .))) %>% # creates new col in 2020 dollars w/ _adj suffix
  select(-year_month, -cpi)

However, this will place all new columns after the last column, by default.

Is there any way to dynamically order the new columns directly after the matching column, like: gross_1, gross_1_adj, price_2, price_2_adj, etc.? Of course this is trivial using select() manually, but I'm assuming there is some way to reference these columns dynamically using mutate() and the .after argument.

2 Answers2

0

You can do this by following the approach from Stackoverflow: How to relocate several columns in one step using dplyr::relocate?

First of all, you'll need to store the variable names that you are you are using in across selection in a vector or list.

gross_price_vars <- names(grosses) %>% str_subset("gross|price")

gross_price_vars
# [1] "weekly_gross_overall" "weekly_gross" "potential_gross"  
#      "avg_ticket_price" "top_ticket_price"  

With the help of reduce from the purrr package you can relocate the freshly generated variables after the ones you used for calculation and whose names you stored in gross_price_vars.

 grosses_adj %>%
    reduce(.x = gross_price_vars, 
           .f = ~ relocate(.x, str_c(.y, "_adj"), .after = .y), 
           .init = .)

 # week_ending week_number weekly_gross_overall  weekly_gross_overall_adj show     theatre weekly_gross weekly_gross_adj
 # <date>            <dbl>                <dbl>                    <dbl> <chr>     <chr>          <dbl>            <dbl>
 # 1 1985-06-09          1              3915937                 9576099. 42nd Str~ St. Ja~       282368          690508.
 # 2 1985-06-09          1              3915937                 9576099. A Chorus~ Sam S.~       222584          544311.
 # 3 1985-06-09          1              3915937                 9576099. Aren't W~ Brooks~       249272          609574.
 # 4 1985-06-09          1              3915937                 9576099. Arms and~ Circle~        95688          233997.
 # 5 1985-06-09          1              3915937                 9576099. As Is     Lyceum~        61059          149315.  

Keep in mind, that the .x inside relocate is another object than the .x outside from it. Inside relocate it referes to the .init argument, which is the data.frame, while .y refereres to the names the variable list (the .x argument from the reduce function).

The Variables generated from the mutate(across ...) function have the postfix "_adj" attached to it's original name. In that way we identify the new variables by str_c(old_varname, "_adj").

mirirai
  • 1,365
  • 9
  • 25
0

I would do something simpler, you can use peek_vars to put the columns in order if you're conscientious about how you're naming your new columns (I usually use prefixes or suffixes to select). It looks like it would work fine in this case.

The first columns (week_ending, week_number) will not be part of the sort, everything else after that will be alphabetical.

grosses_adj %>% 
  select(week_ending, week_number, sort(tidyselect::peek_vars()))
kpress
  • 136
  • 6