I have a data frame with the following structure:
library(tibble)
df <- tibble(ID = c(1, 1, 1, 2, 2), key = c("a", "b", "b", "c", "c"), value = c("k1", "k3", "k1", "k2", "k5"))
     ID key   value
  <dbl> <chr> <chr>
1     1 a     k1
2     1 b     k3
3     1 b     k1
4     2 c     k2
5     2 c     k5
What I need is, for each ID, to group together the rows where key is equal and then one-hot encode value across all possible unique values of the value column. That is, I want something like:
     ID key      k1    k2    k3    k5
  <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1     1 a         1     0     0     0
2     1 b         1     0     1     0
3     2 c         0     1     0     1
I could provide a list of possible values, such as possible_values = c("k1", "k2", "k3", ...), if this helps.
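To make the intended transformation concrete, here is a minimal sketch of one direction that might work. It assumes dplyr and tidyr (1.1.0 or later, for names_sort and a scalar values_fill) are installed. The helper column name present and the result name one_hot are hypothetical, chosen only for illustration.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Example data from the question
df <- tibble(
  ID    = c(1, 1, 1, 2, 2),
  key   = c("a", "b", "b", "c", "c"),
  value = c("k1", "k3", "k1", "k2", "k5")
)

one_hot <- df %>%
  # Drop repeated (ID, key, value) rows so each cell holds a single indicator
  distinct(ID, key, value) %>%
  # Hypothetical helper column marking that the value occurs in the group
  mutate(present = 1) %>%
  # One column per unique value; combinations that never occur become 0
  pivot_wider(
    names_from  = value,
    values_from = present,
    values_fill = 0,
    names_sort  = TRUE
  )

print(one_hot)
```

This sketch only creates columns for values that actually occur in df. If possible_values contains entries that never appear in the data, those columns would have to be added separately.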