I want to pass all values in a dataframe as condition to dplyr::case_when() with stringr::str_detect() while using the respective column title als replacement value.
I have these two data frames:
> print(city_stack)
# A tibble: 11 × 1
city
<chr>
1 Britz
2 Berlin-Reinickendorf
3 Berlin-Kladow
4 Berlin-Spindlersfeld
5 Berlin-Mahlsdorf
6 Berlin-Lichterfelde
7 Berlin-Spandau
8 Berlin-Biesdorf
9 Berlin-Niederschöneweide
10 Rüdersdorf bei Berlin
11 Berlin-Nordend
> print(districts_stack)
# A tibble: 10 × 2
Berlin Köln
<chr> <chr>
1 Adlershof Rodenkirchen
2 Altglienicke Chorweiler
3 Baumschulenweg Ehrenfeld
4 Biesdorf Kalk
5 Blankenburg Lindenthal
6 Blankenfelde Mülheim
7 Bohnsdorf Nippes
8 Britz Porz
9 Buch Kölner Zoo
10 Buckow Universität zu Köln
I tried using a nested for loop:
for (i in colnames(districts_stack)){
for (j in districts_stack[[i]]){
mutate(city_stack, case_when(
str_detect(city, paste0(j) ~ i,
TRUE ~ city)
)
}
}
While that totally works, this is extremely inefficient and gets problematic with the huge dataframe I am actually working with. I feel like there should be a more efficient solution using purrr::map(), but I wasn't able to come up with anything working.
dput() of the dataframes:
dput(city_stack[1:11,])
structure(list(city = c("Britz", "Berlin-Reinickendorf", "Berlin-Kladow",
"Berlin-Spindlersfeld", "Berlin-Mahlsdorf", "Berlin-Lichterfelde",
"Berlin-Spandau", "Berlin-Biesdorf", "Berlin-Niederschöneweide",
"Rüdersdorf bei Berlin", "Berlin-Nordend")), row.names = c(NA,
-11L), class = c("tbl_df", "tbl", "data.frame"))
> dput(districts_stack[1:10,1:2])
structure(list(Berlin = c("Adlershof", "Altglienicke", "Baumschulenweg",
"Biesdorf", "Blankenburg", "Blankenfelde", "Bohnsdorf", "Britz",
"Buch", "Buckow"), Köln = c("Rodenkirchen", "Chorweiler", "Ehrenfeld",
"Kalk", "Lindenthal", "Mülheim", "Nippes", "Porz", "Kölner Zoo",
"Universität zu Köln")), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))