Arrange and combine values in table in R by unique identifier

Question

I have the following data:

# IDs are stored as strings so the leading zeros are preserved
ID <- c("001","002","003","003","004","005")
Email <- c("tom@abc.com","jane@abc.com","jim@abc.com","jim@abc.com","tom@abc.com","mike@abc.com")
df <- data.frame(ID, Email, stringsAsFactors = FALSE)

I want to create a table that lists the unique ID numbers associated with each email address:

Email           IDs
jane@abc.com    002
jim@abc.com     003
mike@abc.com    005
tom@abc.com     001, 004

I've tried an apply function, tapply(df$ID, df$Email, FUN=length), but I am only getting a count of rows per email (duplicates included) rather than the IDs themselves.

jane@abc.com    1
jim@abc.com     2
mike@abc.com    1
tom@abc.com     2

DanY · Answer 1 · 2018-07-26T21:15:58.743

With a data.table, this is simple:

df <- data.frame(
    id = c("001","002","003","003","004","005"),
    email = c("tom@abc.com","jane@abc.com","jim@abc.com","jim@abc.com","tom@abc.com","mike@abc.com"),
    stringsAsFactors = FALSE
)

library(data.table)
setDT(df)
df[ , .(idlist = paste(unique(id), collapse = ", ")), by = email]
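If you would rather stay in base R, here is a rough sketch of the same idea using tapply(), the function tried in the question. It assumes the same sample data as above; the names idlist and res are just illustrative, and this is one possible approach rather than the only one.

```r
# Same sample data as above, with ids kept as character so "001" keeps its zeros
df <- data.frame(
    id = c("001","002","003","003","004","005"),
    email = c("tom@abc.com","jane@abc.com","jim@abc.com","jim@abc.com","tom@abc.com","mike@abc.com"),
    stringsAsFactors = FALSE
)

# For each email, collapse the unique ids into a single comma-separated string
idlist <- tapply(df$id, df$email, FUN = function(x) paste(unique(x), collapse = ", "))

# Turn the named result into a two-column data frame (column names are illustrative)
res <- data.frame(email = names(idlist), idlist = as.vector(idlist), stringsAsFactors = FALSE)
res
```

The only change from the attempt in the question is the function passed as FUN: length counts the rows in each group, while the anonymous function here pastes the distinct ids together.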

edited Jul 26 '18 at 21:15

answered Jul 26 '18 at 19:45

or `res <- dt[, .(IDS = paste0(ID, collapse = ", ")), Email]` if you are happy with string values – Bulat Jul 26 '18 at 19:47
@Bulat - I will edit my answer to use this as it's more directly what the OP asked for. Thanks! – DanY Jul 26 '18 at 19:51
@DanY the solution you provided collapses the IDs into one cell. However, I only want to return unique IDs in each cell. In your solution you'll see that jim@abc.com has "003, 003" as the result. The 003 should be listed once. – Danny Jul 26 '18 at 20:25
Just wrap `id` with `unique()`. I've updated my answer to now include this. – DanY Jul 26 '18 at 21:15
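
To make the effect of unique() concrete, here is a small sketch using the two duplicated ids for jim@abc.com (the variable names are made up for illustration):

```r
# The two rows for jim@abc.com carry the same id
ids <- c("003", "003")

# Without unique(), the duplicate is repeated in the collapsed string
with_dupes <- paste(ids, collapse = ", ")

# With unique(), each id appears only once
deduped <- paste(unique(ids), collapse = ", ")

with_dupes
deduped
```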
