I have a dataset of products with two columns, each holding a classification. I would like to assign a group id such that two rows fall in the same group whenever they share a value in either column.
The group id has to be transitive: if class1 is the same for observations 1 and 2, and class2 is the same for observations 2 and 3, then observations 1, 2, and 3 all belong to the same group. In the example below, you can see transitivity at work in the desired output, where rows 1-4 have the same group_id.
Any tips on how to do it would be appreciated =)
# Example
library(tibble)  # provides tribble()

df <- tribble(
  ~id, ~class1, ~class2,
  1, "A", "L1",
  2, "A", "L1",
  3, "B", "L1",
  4, "B", "L2",
  5, "C", "L3",
  6, "D", "L4"
)
# Desired output
result <- tribble(
  ~id, ~class1, ~class2, ~group_id,
  1, "A", "L1", 1,
  2, "A", "L1", 1,
  3, "B", "L1", 1,
  4, "B", "L2", 1,
  5, "C", "L3", 2,
  6, "D", "L4", 3
)
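One way to read the transitivity requirement is as a connected-components problem: treat each row as a node, link rows that share a class1 or a class2 value, and number the resulting components. The following is a minimal base-R sketch of that idea using a small union-find, not a definitive solution. The function name `group_by_shared_class` and its helpers are hypothetical, and it assumes that NA values should be treated as equal to each other (that is how `match()` behaves).

```r
# Hypothetical helper: label rows by connected component, where two rows are
# linked whenever they share a class1 value or a class2 value.
group_by_shared_class <- function(class1, class2) {
  n <- length(class1)
  parent <- seq_len(n)  # union-find forest: every row starts as its own root

  # Follow parent pointers until reaching a row that is its own parent.
  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  # Merge each row with the first row that carries the same key.
  # The smaller row index always becomes the root of the merged set.
  link_rows <- function(key) {
    first_row <- match(key, key)
    for (i in seq_len(n)) {
      a <- find_root(i)
      b <- find_root(first_row[i])
      if (a != b) parent[max(a, b)] <<- min(a, b)
    }
  }

  link_rows(class1)
  link_rows(class2)

  # Number the components in order of first appearance.
  roots <- vapply(seq_len(n), find_root, integer(1))
  match(roots, unique(roots))
}

# Same values as the example data in the question.
class1 <- c("A", "A", "B", "B", "C", "D")
class2 <- c("L1", "L1", "L1", "L2", "L3", "L4")
group_id <- group_by_shared_class(class1, class2)
print(group_id)
```

On the data frame from the example, this could be applied as `df$group_id <- group_by_shared_class(df$class1, df$class2)`. The loop is plain R and may be slow on very large inputs, so a graph library's connected-components routine could be a better fit there.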