Aggregate R data frame over count of a field: Pivot table-like result set

Question

I have a data frame in the following structure

ChannelId,AuthorId
1,32
28,2393293
2,32
2,32
1,2393293
31,3
3,32
5,4
2,5
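
For anyone who wants to reproduce this, the sample above can be read straight into a data frame. This is only a minimal sketch: the object name `df` is an assumption, chosen to match some of the answers below.

```r
# Build the sample data from the question.
# The name `df` is an assumption, not something the question specifies.
df <- read.csv(text = "ChannelId,AuthorId
1,32
28,2393293
2,32
2,32
1,2393293
31,3
3,32
5,4
2,5")

str(df)  # two integer columns, nine rows
```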

What I want is

AuthorId,1,2,3,5,28,31
4,0,0,0,1,0,0
3,0,0,0,0,0,1
5,0,1,0,0,0,0
32,1,2,1,0,0,0
2393293,1,0,0,0,1,0

Is there a way to do this?

@StevenBeaupré I have no idea how to turn the ChannelId values into column headers. I played around with aggregate, dplyr::count and count, but had no luck. — Bedi Egilmez, Jul 15 '16 at 20:24

score 5 · Accepted Answer · answered Jul 15 '16 at 20:55

The xtabs function can be called with a formula that specifies the margins:

xtabs(~ AuthorId + ChannelId, data = dat)

         ChannelId
AuthorId  1 2 28 3 31 5
  2393293 1 0  1 0  0 0
  3       0 0  0 0  1 0
  32      1 2  0 1  0 0
  4       0 0  0 0  0 1
  5       0 1  0 0  0 0
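
The column order in that output (1 2 28 3 31 5) is alphabetical, which suggests the ids in `dat` were stored as character or factor values. As a rough sketch, assuming the ids are stored as integers instead, xtabs orders the levels numerically, and as.data.frame.matrix turns the contingency table into a wide data frame. The object names `tab` and `wide` are made up for illustration.

```r
# Sample data with integer ids (an assumption about how `dat` is stored).
dat <- data.frame(
  ChannelId = c(1L, 28L, 2L, 2L, 1L, 31L, 3L, 5L, 2L),
  AuthorId  = c(32L, 2393293L, 32L, 32L, 2393293L, 3L, 32L, 4L, 5L)
)

# Cross-tabulate: AuthorId in rows, ChannelId in columns.
tab <- xtabs(~ AuthorId + ChannelId, data = dat)

# Convert the table to a wide data frame (row names hold the AuthorId values).
wide <- as.data.frame.matrix(tab)
wide
```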

Steven Beaupré · Answer 2 · 2016-07-15T20:41:18.997

Perhaps the simplest way would be t(table(df)):

#         ChannelId
#AuthorId  1 2 3 5 28 31
#  3       0 0 0 0  0  1
#  4       0 0 0 1  0  0
#  5       0 1 0 0  0  0
#  32      1 2 1 0  0  0
#  2393293 1 0 0 0  1  0
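
The t() is needed because table(df) follows the column order of the data frame, so ChannelId ends up in the rows. A small sketch of an alternative, assuming the same `df` as in the question: select the columns in the desired order before tabulating, which avoids the transpose. The names `tab_t` and `tab_d` are hypothetical.

```r
# Sample data from the question (the name `df` is assumed).
df <- data.frame(
  ChannelId = c(1L, 28L, 2L, 2L, 1L, 31L, 3L, 5L, 2L),
  AuthorId  = c(32L, 2393293L, 32L, 32L, 2393293L, 3L, 32L, 4L, 5L)
)

# Transposed table, as in the answer above.
tab_t <- t(table(df))

# Same counts without t(): put AuthorId first so it becomes the row variable.
tab_d <- table(df[c("AuthorId", "ChannelId")])

all(tab_t == tab_d)  # both approaches give the same counts
```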

If you want to use dplyr::count you could do:

library(dplyr)
library(tidyr)

df %>%
  count(AuthorId, ChannelId) %>% 
  spread(ChannelId, n, fill = 0)

Which gives:

#Source: local data frame [5 x 7]
#Groups: AuthorId [5]
# 
#  AuthorId     1     2     3     5    28    31
#*    <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1        3     0     0     0     0     0     1
#2        4     0     0     0     1     0     0
#3        5     0     1     0     0     0     0
#4       32     1     2     1     0     0     0
#5  2393293     1     0     0     0     1     0
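
In newer tidyr releases spread() is superseded by pivot_wider(). A sketch of the equivalent call follows, assuming tidyr 1.1.0 or later (for `names_sort` and a scalar `values_fill`). Unlike spread(), pivot_wider() orders the new columns by first appearance unless `names_sort = TRUE` is set. The name `wide` is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (the name `df` is assumed).
df <- data.frame(
  ChannelId = c(1L, 28L, 2L, 2L, 1L, 31L, 3L, 5L, 2L),
  AuthorId  = c(32L, 2393293L, 32L, 32L, 2393293L, 3L, 32L, 4L, 5L)
)

wide <- df %>%
  count(AuthorId, ChannelId) %>%      # one row per (author, channel) pair with count n
  pivot_wider(
    names_from  = ChannelId,          # channel ids become column names
    values_from = n,                  # cells hold the counts
    values_fill = 0L,                 # missing combinations become 0 (n is integer)
    names_sort  = TRUE                # sort columns by channel id
  )

wide
```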

Shouldn't you load at least one of `dplyr` or `tidyr` for `%>%` to work? — Sumedh, Jul 15 '16 at 20:40

score 2 · Answer 3 · answered Jul 16 '16 at 00:59

We can also use dcast from data.table. Convert the 'data.frame' to 'data.table' and use dcast with the fun.aggregate as length.

library(data.table)
dcast(setDT(df1), AuthorId~ChannelId, length)
#   AuthorId 1 2 3 5 28 31
#1:        3 0 0 0 0  0  1
#2:        4 0 0 0 1  0  0
#3:        5 0 1 0 0  0  0
#4:       32 1 2 1 0  0  0
#5:  2393293 1 0 0 0  1  0
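
When no value column is named, dcast has to guess one. A slightly more explicit sketch of the same call, assuming the same data as in the question: it names `fun.aggregate` and passes `value.var`, so nothing is guessed. Note that setDT converts `df1` to a data.table by reference. The name `wide` is hypothetical.

```r
library(data.table)

# Sample data from the question (the name `df1` follows the answer above).
df1 <- data.frame(
  ChannelId = c(1L, 28L, 2L, 2L, 1L, 31L, 3L, 5L, 2L),
  AuthorId  = c(32L, 2393293L, 32L, 32L, 2393293L, 3L, 32L, 4L, 5L)
)

# setDT() converts df1 in place; length() counts rows per (author, channel) cell.
wide <- dcast(
  setDT(df1),
  AuthorId ~ ChannelId,
  fun.aggregate = length,
  value.var = "ChannelId"   # any column works here, since only the row count matters
)

wide
```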
