Create a new column in dataframe containing variables of another column, based on dataframe subsets

Question

I have a data frame (df) and am trying to add a column z that lists the values of column y, but only the values that occur within each group of rows defined by column x.

df <- data.frame('x'=c("a","a","b","b"), 'y'=c("a","c","c","b"))
  x y
1 a a
2 a c
3 b c
4 b b
# Desired outcome:
df <- data.frame(df, 'z'=c("a,c", "a,c", "c,b", "c,b"))
  x y   z
1 a a a,c
2 a c a,c
3 b c c,b
4 b b c,b

I know there are a bunch of questions here on how to add/create new columns in a dataframe, but I couldn't find any involving subsetting. I was thinking of using the dplyr package and filter() or mutate(), or aggregating the elements with aggregate(), but have had no success. My attempts:

library(dplyr)
z <- for (i in row.names(df)) {
  filter(df, x == unique(i))
  df[ ,3] <- levels(df$y)
}

z <- aggregate(x = df, by = as.list(df$x), FUN = levels)

Much thanks.

score 3 · Answer 1 · answered Apr 21 '20 at 19:27

We can group by 'x' and then paste the values of 'y' together with toString:

library(dplyr)
df %>%
    group_by(x) %>%
    mutate(z = toString(y))
# A tibble: 4 x 3
# Groups:   x [2]
#  x     y     z    
#  <fct> <fct> <chr>
#1 a     a     a, c 
#2 a     c     a, c 
#3 b     c     c, b 
#4 b     b     c, b
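
Note that the printed result above is still grouped (`# Groups:   x [2]`). A minimal sketch, assuming the grouping is no longer wanted afterwards and that the columns are factors as in the printed tibble (hence `stringsAsFactors = TRUE`, which is an assumption about how the data was created):

```r
library(dplyr)

# Data from the question; stringsAsFactors = TRUE is an assumption
# made to match the <fct> columns in the printed output
df <- data.frame(x = c("a", "a", "b", "b"),
                 y = c("a", "c", "c", "b"),
                 stringsAsFactors = TRUE)

res <- df %>%
    group_by(x) %>%
    mutate(z = toString(y)) %>%  # one string per group, repeated on every row
    ungroup()                    # drop the grouping so later verbs act on all rows

res$z
```

Calling ungroup() does not change the values in z; it only removes the grouping attribute from the result.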

aggregate returns a summarised output (one row per group), so if we need to create a column with the same number of rows as the original data in base R, use ave:

df$z <- with(df, ave(as.character(y), x, FUN = toString))
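
To make the contrast between aggregate and ave concrete, here is a small sketch on the question's data. The name `agg` is only illustrative, and `stringsAsFactors = TRUE` is an assumption made to match the factor columns shown earlier:

```r
# Data from the question; stringsAsFactors = TRUE is an assumption
df <- data.frame(x = c("a", "a", "b", "b"),
                 y = c("a", "c", "c", "b"),
                 stringsAsFactors = TRUE)

# aggregate() collapses the data to one row per level of x
agg <- aggregate(y ~ x, data = df,
                 FUN = function(v) toString(as.character(v)))
nrow(agg)  # 2 rows, one per group

# ave() returns a vector as long as its input, so it can be
# assigned directly as a new column
df$z <- ave(as.character(df$y), df$x, FUN = toString)
nrow(df)   # still 4 rows
```

The as.character() call matters for ave because it writes the results back into a copy of its first argument, and a factor cannot hold the new combined strings.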

If we don't need the space after the comma, use paste with collapse="," instead (toString(x) is equivalent to paste(x, collapse=", ")):

df$z <- with(df, ave(as.character(y), x, FUN = function(x) paste(x, collapse=",")))
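
As a quick check of the two claims above, the following sketch confirms the toString/paste equivalence on a sample vector and compares the space-free result with the z column the question asked for. The names `v`, `same`, and `desired` are only illustrative:

```r
# toString() is paste() with collapse = ", "
v <- c("a", "c")
same <- identical(toString(v), paste(v, collapse = ", "))

# Data from the question
df <- data.frame(x = c("a", "a", "b", "b"),
                 y = c("a", "c", "c", "b"))

# Space-free variant: collapse with "," only
df$z <- ave(as.character(df$y), df$x,
            FUN = function(v) paste(v, collapse = ","))

# The z column from the question's desired outcome
desired <- c("a,c", "a,c", "c,b", "c,b")
all(df$z == desired)
```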

Thanks akrun. Those both work, as well as the solutions for the other questions Henrik posted. — DHoog, Apr 22 '20 at 12:14
