
I am taking my first steps with coding and with R, and I have a problem:

I have one data frame with this format:

Months | Person
April  | Person1
May    | Person2
April  | Person1
June   | Person3
May    | Person4

and I want this output:

May - Person2, Person4

April - Person1

June - Person3

I am using unique(df$Months) and I get the unique months, but I can't obtain the persons.

I was thinking of saving the row indices for each value of unique(df$Months) and selecting the "Person" values at those indices, repeating this for each unique month. But this doesn't seem "optimal" or good practice.

Can anyone help me?

David Arenburg
RookieSun

1 Answer


You could use aggregate (from base R) on the de-duplicated rows (unique(df)), with toString to paste together the "Person" values grouped by "Months". toString is a wrapper for paste(., collapse=', ').

aggregate(.~Months, unique(df), toString)
#   Months           Person
#1  April          Person1
#2   June          Person3
#3    May Person2, Person4
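To try this on the sample from the question, here is a minimal sketch. The data frame construction is an assumption reconstructed from the question's table (reading "Person 3" as "Person3"), and the final paste step is just one possible way to get the "Month - Persons" layout that was asked for.

```r
# Sample data reconstructed from the question (an assumption about the real df)
df <- data.frame(
  Months = c("April", "May", "April", "June", "May"),
  Person = c("Person1", "Person2", "Person1", "Person3", "Person4"),
  stringsAsFactors = FALSE
)

# Drop duplicate rows, then collapse Person into one string per month
res <- aggregate(Person ~ Months, data = unique(df), FUN = toString)
print(res)

# Optional: build "Month - Persons" strings like the requested output
out <- paste(res$Months, res$Person, sep = " - ")
print(out)
```

Note that aggregate returns the groups sorted by "Months", not in the order they first appear in df.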

Or the same can be done with data.table: convert df to a "data.table" with setDT, remove the duplicate rows with unique, and then collapse "Person" by "Months".

library(data.table)
# setDT() converts df to a data.table by reference; unique() drops duplicate rows
unique(setDT(df))[, list(Person = toString(Person)), by = Months]
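A self-contained version of the data.table approach might look like the sketch below. It assumes the data.table package is installed, and the sample data is again reconstructed from the question rather than taken from the real df. Here the table is created directly with data.table(), so setDT is not needed.

```r
library(data.table)

# Sample data reconstructed from the question (an assumption about the real df)
dt <- data.table(
  Months = c("April", "May", "April", "June", "May"),
  Person = c("Person1", "Person2", "Person1", "Person3", "Person4")
)

# unique() drops duplicate rows; toString() collapses Person within each month
res_dt <- unique(dt)[, .(Person = toString(Person)), by = Months]
print(res_dt)
```

Unlike aggregate, grouping with by keeps the months in order of first appearance rather than sorting them.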
akrun