In R, I have a data frame dummy that looks like this:
dummy <- data.frame(a   = c("b01", "b01", "b02"),
                    id  = c(456, 456, 233),
                    id2 = c(888, 888, 889),
                    t   = c("neg", "no", "pos"),
                    j   = c("no", "no", "no"),
                    y   = c("pos", "no", "neg"),
                    q   = c("pos", "no", "no"),
                    w   = c("asd", "asd", "sdf"))
#     a  id id2   t  j   y   q   w
# 1 b01 456 888 neg no pos pos asd
# 2 b01 456 888  no no  no  no asd
# 3 b02 233 889 pos no neg  no sdf
I want to merge the rows by columns a, id, and id2, but for each of the other columns I only want to keep the corresponding neg or pos when it appears in either of the rows, and no if both are no.
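To make the rule concrete, here is a minimal sketch of it for a single column within one group. The helper name pick_flag is hypothetical (not from any package), and it assumes a group never contains both neg and pos in the same column; if it did, this version would simply keep the first one it finds.

```r
# Hypothetical helper: reduce the values of one column within a group
# to a single flag.
# Assumption: at most one distinct non-"no" value per column per group.
pick_flag <- function(x) {
  hits <- unique(x[x != "no"])           # values other than "no"
  if (length(hits) == 0) "no" else hits[1]
}

pick_flag(c("neg", "no"))  # keeps the non-"no" value
pick_flag(c("no", "no"))   # falls back to "no"
```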
I've tried:
library(dplyr)
z <- dummy %>%
  group_by(a, id, id2) %>%
  summarise(
    t = paste(t, collapse = "-"),
    j = paste(j, collapse = "-"),
    y = paste(y, collapse = "-"),
    q = paste(q, collapse = "-")
  )
This works (after removing the unwanted text with gsub), but then column w is dropped, because it is neither a grouping column nor one of the summarised columns.
The desired data frame would look like this:
#     a  id id2   t  j   y   q   w
# 1 b01 456 888 neg no pos pos asd
# 3 b02 233 889 pos no neg  no sdf
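For reference, one way this result might be produced is sketched below. It is only a sketch under two assumptions that are not guaranteed by the data in general: w is constant within each (a, id, id2) group, so it can be added to the grouping to keep it, and each column holds at most one distinct non-"no" value per group. It relies on across() and the .groups argument, which need dplyr 1.0.0 or later.

```r
library(dplyr)

dummy <- data.frame(a   = c("b01", "b01", "b02"),
                    id  = c(456, 456, 233),
                    id2 = c(888, 888, 889),
                    t   = c("neg", "no", "pos"),
                    j   = c("no", "no", "no"),
                    y   = c("pos", "no", "neg"),
                    q   = c("pos", "no", "no"),
                    w   = c("asd", "asd", "sdf"),
                    stringsAsFactors = FALSE)

z <- dummy %>%
  # Assumption: w never varies within a group, so grouping by it
  # keeps the column without creating extra rows.
  group_by(a, id, id2, w) %>%
  summarise(
    across(c(t, j, y, q), function(x) {
      hits <- unique(x[x != "no"])       # values other than "no"
      if (length(hits) == 0) "no" else hits[1]
    }),
    .groups = "drop"
  ) %>%
  # Restore the original column order (w last).
  select(a, id, id2, t, j, y, q, w)

z
```

If w can differ within a group, grouping by it would split the group into several rows; in that case something like w = first(w) inside summarise() would be one alternative, at the cost of silently discarding the other values.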
Any help would be appreciated.
I've also looked at:
(Collapse text by group in data frame) and (dplyr summarise: Equivalent of ".drop=FALSE" to keep groups with zero length in output)