I want to transforms a data frame with contact information with of a for a list of municipalities in which similar information such as e.g. phone number appears in multiple columns.
I have tried using both reshape2::dcast()
as well as tidyr::spread()
, neither of which solves my problem. I have also checked other post of stack overflow e.g.
Have yet to find a solution which works. It seems to me that the problems should be fairly straightforward (and solvable with spread or dcast).
tmp <- tibble(municipality = c("M1", "M2"),
name1 = c("n1", "n2"), name2 = c("n3", "n4"), name3 = c(NA, "n5"), # placeholder names
phone1 = c("p1", "p2"), phone2 = c("p3", "p4"), phone3 = c(NA, "p5")) # placeholder phone numbers
#solution 1
tmp %>% gather("colname", "value", -municipality) %>%
filter(municipality == "M1") %>% #too simplify, should be replaced with group_by(municipality)
na.omit() %>% mutate(colname = str_replace(colname, "\\d", replacement = "")) %>%
spread(., key = "colname", value = "value")
#Solution 2
tmp %>% gather("colname", "value", -municipality) %>%
filter(municipality == "M1") %>% # same as above
na.omit() %>% mutate(colname = str_replace(colname, "\\d", replacement = "")) %>%
dcast(municipality + value ~colname)
Solution 1 results in the following error: Error: Each row of output must be identified by a unique combination of keys.
Solution 2 results in the following data frame (which is the desired result except it needs to be collapsed):
municipality value name phone
1 M1 n1 n1 <NA>
2 M1 n3 n3 <NA>
3 M1 p1 <NA> p1
4 M1 p3 <NA> p3