From your description, it sounds like you're just looking for aggregate. Consider the following:
> df = data.frame(user_id = c(1,2,2,3),
+ advertiser_id = c(1:4),
+ other_data = letters[c(1, 2, 2, 3)])
> df
  user_id advertiser_id other_data
1       1             1          a
2       2             2          b
3       2             3          b
4       3             4          c
> aggregate(advertiser_id ~ . , df, I)
  user_id other_data advertiser_id
1       1          a             1
2       2          b          2, 3
3       3          c             4
The above converts the "advertiser_id" column into a list, as can be inspected using str. This might be convenient, but it can also be awkward to work with, for instance if you want to save your output to a CSV file later on.
> str(aggregate(advertiser_id ~ . , df, I))
'data.frame': 3 obs. of  3 variables:
 $ user_id      : num  1 2 3
 $ other_data   : Factor w/ 3 levels "a","b","c": 1 2 3
 $ advertiser_id:List of 3
  ..$ 0:Class 'AsIs'  int 1
  ..$ 4:Class 'AsIs'  int [1:2] 2 3
  ..$ 8:Class 'AsIs'  int 4
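To illustrate the CSV concern: write.csv typically fails on a list column, so one option is to flatten the list to text before writing. This is only a sketch; the names out, flat, tmp and back are made up for illustration.

```r
df <- data.frame(user_id = c(1, 2, 2, 3),
                 advertiser_id = 1:4,
                 other_data = letters[c(1, 2, 2, 3)])

# Group sizes differ here, so aggregate() keeps the result as a list column
out <- aggregate(advertiser_id ~ ., df, I)

# Collapse each list element into one string so the column is plain character
flat <- out
flat$advertiser_id <- vapply(out$advertiser_id, paste, character(1),
                             collapse = ", ", USE.NAMES = FALSE)

# A character column can be written to CSV and read back
tmp <- tempfile(fileext = ".csv")
write.csv(flat, tmp, row.names = FALSE)
back <- read.csv(tmp)
```

If you need the individual ids again after reading the file back, something like strsplit(as.character(back$advertiser_id), ", ") should recover them as character vectors.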
A less flexible alternative is to concatenate the "advertiser_id" values within each group into a single character string.
> aggregate(advertiser_id ~ . , df, paste, collapse = ", ")
  user_id other_data advertiser_id
1       1          a             1
2       2          b          2, 3
3       3          c             4
> str(aggregate(advertiser_id ~ . , df, paste, collapse = ", "))
'data.frame': 3 obs. of  3 variables:
 $ user_id      : num  1 2 3
 $ other_data   : Factor w/ 3 levels "a","b","c": 1 2 3
 $ advertiser_id: chr  "1" "2, 3" "4"
Both of these can also easily be done with data.table, along the lines of @eddi's answer.
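For reference, here is roughly what the two variants might look like in data.table. This is my own sketch and not necessarily what @eddi's answer does; it assumes the data.table package is installed, and dt, dt_list and dt_chr are illustrative names.

```r
library(data.table)

dt <- data.table(user_id = c(1, 2, 2, 3),
                 advertiser_id = 1:4,
                 other_data = letters[c(1, 2, 2, 3)])

# List-column variant: wrap each group's ids in list() so they stay one cell
dt_list <- dt[, .(advertiser_id = list(advertiser_id)),
              by = .(user_id, other_data)]

# Character variant: paste each group's ids into a single string
dt_chr <- dt[, .(advertiser_id = paste(advertiser_id, collapse = ", ")),
             by = .(user_id, other_data)]
```

As with aggregate, the list version keeps the ids as integers you can compute on, while the pasted version is easier to print or export.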