Fill cells dataframe based on multiple conditions

Question

How to fill cells based on multiple conditions?

There are a lot of players (columns) in this game, but I only included 2 for the sake of this example. I want to loop over a lot of players.
Every row represents a game round.

Conditions:

IF player00[i] score = 0 &

IF lossallowed00[i] = "no"

THEN Fill flag00[i] with "FLAG"

df <-data.frame(
  player001 = c(1,0,3),
  player002 = c(1,0,5),
  lossallowed001 = c("no", "yes", "no"),
  lossallowed002 = c("no", "no", "yes"),
  flag001 = NA,
  flag002 = NA
)

#desired output:
#player001 player002 lossallowed001 lossallowed002 flag001 flag002
#    1         1             no             no      NA      NA
#    0         0            yes             no      NA      FLAG
#    3         5             no            yes      NA      NA

You'll probably want to pivot this into an `round | player_id | score | lossallowed | flag` format. Can you be sure of having no more than `999` players? — Greg, Jan 10 '22 at 18:43
FYI I just fixed a typo where you had "loseallowed" instead of "lossallowed" that would cause columns to not match — camille, Jan 10 '22 at 18:51

score 2 · Answer 1 · answered Jan 10 '22 at 19:00

If you use a method of reshaping to long format, splitting out the IDs based on the pattern of column names being variables made of letters and IDs being made of numbers, you can do the operation all at once in a couple lines and reshape back to wide. Using regex means you're not bound by either the number of players or the names of columns. I added an ID column for the games to differentiate rows; you could drop it afterward.

The reshaping itself is covered pretty extensively already (Reshaping multiple sets of measurement columns (wide format) into single columns (long format) for example) but is useful for problems that need to scale like this.

library(dplyr)

df %>%
  tibble::rowid_to_column(var = "game") %>%
  tidyr::pivot_longer(-game, names_to = c(".value", "num"), 
                      names_pattern = "(^[a-z]+)(\\d+$)") %>%
  mutate(flag = ifelse(player == 0 & lossallowed == "no", "FLAG", NA_character_)) %>%
  tidyr::pivot_wider(id_cols = game, names_from = num, values_from = player:flag, 
                     names_glue = "{.value}{num}")
#> # A tibble: 3 × 7
#>    game player001 player002 lossallowed001 lossallowed002 flag001 flag002
#>   <int>     <dbl>     <dbl> <chr>          <chr>          <chr>   <chr>  
#> 1     1         1         1 no             no             <NA>    <NA>   
#> 2     2         0         0 yes            no             <NA>    FLAG   
#> 3     3         3         5 no             yes            <NA>    <NA>

Maël · Answer 2 · 2022-01-10T19:14:46.270

You can do this. First reshape the data, and then add the column. Use bind_cols if you want the data to be merged back.

library(purrr)
library(dplyr)

map(set_names(paste0("00", 1:2)), ~ select(df, ends_with(.x))) %>% 
  map(., ~ mutate(., newcol = ifelse(.[[1]] == 0 & .[[2]] == "no", "FLAG", NA)))

$`001`
  player001 lossallowed001 flag001 newcol
1         1             no      NA     NA
2         0            yes      NA     NA
3         3             no      NA     NA

$`002`
  player002 loseallowed002 flag002 newcol
1         1             no      NA   <NA>
2         0             no      NA   FLAG
3         5            yes      NA   <NA>

Since they said they'll have a lot more players than just those 2, you could do something to create a vector of player IDs to iterate over so they're not hard-coded — camille, Jan 10 '22 at 19:07

PaulS · Answer 3 · 2022-01-10T18:58:37.963

A possible solution:

library(tidyverse)

df <-data.frame(player001 = c(1,0,3), player002 = c(1,0,5),lossallowed001 = c("no", "yes", "no"), loseallowed002 = c("no", "no", "yes"),flag001 = NA, flag002 = NA)

df %>%
  rownames_to_column("id") %>% 
  mutate(across(where(is.numeric), as.character)) %>% 
  pivot_longer(cols = -id) %>% 
  group_by(str_extract(name, "\\d{3}$"), id) %>% 
  mutate(value = if_else(row_number() == 3 & first(value) == "0" &
         nth(value, 2) == "no", "FLAG", value)) %>% 
  ungroup %>% select(name, value) %>% 
  pivot_wider(names_from = name, values_from = value, values_fn = list) %>% 
  unnest(cols = everything()) %>% type.convert(as.is = TRUE)

#> # A tibble: 3 × 6
#>   player001 player002 lossallowed001 loseallowed002 flag001 flag002
#>       <int>     <int> <chr>          <chr>          <lgl>   <chr>  
#> 1         1         1 no             no             NA      <NA>   
#> 2         0         0 yes            no             NA      FLAG   
#> 3         3         5 no             yes            NA      <NA>

score 1 · Answer 4 · answered Jan 10 '22 at 19:17

Here's a solution in the tidyverse. While I arrived at this solution independently, this is likely a duplicate of @camille's solution here, which was posted shortly before mine.

library(tidyverse)

# ...
# Code to generate 'df'.
# ...

df %>%
  # Index the matches.
  mutate(match_id = row_number()) %>%
  
  # Pivot to get a row for each player {001, 002, ...} and match.
  pivot_longer(
    # Target columns whose names end with a separate suffix of 3+ digits.
    matches("^(.*\\D)(\\d{3,})$"),
    names_pattern = "^(.*\\D)(\\d{3,})$",
    # Index the players by their suffixes; and give each the following three columns:
    # 'player' (score), 'lossallowed', and 'flag'.
    names_to = c(".value", "player_id")
  ) %>%
  
  # Flag the appropriate cases.
  mutate(
    flag = if_else(player == 0 & lossallowed == "no", "FLAG", NA_character_)
  ) %>%
  
  # Return to original, wide format.
  pivot_wider(
    names_from = player_id,
    values_from = !c(match_id, player_id),
    names_glue = "{.value}{player_id}"
  ) %>%
  arrange(match_id) %>% select(!match_id)

Fill cells dataframe based on multiple conditions

4 Answers4