I have a long data frame that i want to widen, but one key has two different values:
df <- data.frame(ColA=c("A", "B", "B", "C"), ColB=letters[23:26])
ColA ColB
1 A w
2 B x
3 B y
4 C z
I want my output to be a paste of the two values for this key together:
ColA ColB
1 A w
2 B xy
3 C z
A regular pivot_wider()
will throw a warning and convert the values to lists:
df.wide <- df %>%
pivot_wider(names_from=ColA, values_from=ColB)
Warning message:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates
# A tibble: 1 x 3
A B C
<list> <list> <list>
1 <chr [1]> <chr [2]> <chr [1]>
Based on the warning it looks like pivot_wider()
with a value_fn()
is similar to what I want as an intermediate step:
# intermediate step
df.wide <- df %>%
pivot_wider(names_from=ColA, values_from=ColB, values_fn=SOMETHING)
A B C
1 w xy z
But it looks like values_fn()
only takes summary functions, and not something that would work on character data (like paste()
)
The closest I can get is:
df %>%
pivot_wider(names_from=ColA, values_from=ColB, values_fn=list) %>%
mutate(across(everything(), as.character)) %>%
pivot_longer(cols=everything(), names_to="ColA", values_to="ColB")
# A tibble: 3 x 2
ColA ColB
<chr> <chr>
1 A "w"
2 B "c(\"x\", \"y\")"
3 C "z"
With an additional mutating gsub()
-type function. Surely there's an easier way! Preferably within the tidyverse
, but also open to other packages.
Thanks