I want to assign the same unique IDs to competitors in both of the data frames below (master.treeDQ2 and rank_table).
The master.treeDQ2 data frame has two competitor names in each row, while rank_table only has one in each row.
I would like to assign a unique ID to each competitor based on their name and the gym they train at. This is to avoid assigning the same ID to different people with the same name.
In the master.treeDQ2 df, some people in the comp_01name column also appear in the comp_02name column, while others only appear in one or the other.
I would like R to do the following:
- give the same ID to a person who appears in both comp_01name and comp_02name.
- assign new IDs to people in comp_02name who do not appear in comp_01name, without reusing any ID already given in comp_01name.
- assign the same IDs used for each competitor in master.treeDQ2 to the corresponding competitor in rank_table.
Is there maybe a loop or a function I can apply to get this done?
I saw something similar to what I need here, but it only works with two columns, so I'm stuck with unique IDs for competitor 1 only.
What I have:
rank_table = read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/rank_table_complete2.csv')
master.treeDQ2 = read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/master.treeDQ.edited_draft6.csv')
# ASSIGN ID'S by name
# competitor 1
master.treeDQ2$ID1 <- cumsum(!duplicated(master.treeDQ2[,c(11,12)]))
Simplified version of what I want to achieve for the master data frame:
comp01  gym1  ID1  comp02  gym2  ID2
A       w     1    D       z     4
A       w     1    D       z     4
B       x     2    A       w     1
B       x     2    D       z     4
B       x     2    D       z     4
C       y     3    A       w     1
C       y     3    B       x     2
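
One possible way to get this result, sketched on the toy table above rather than on the real data: build a name-plus-gym key for each side, take the unique keys with the competitor 1 keys first, and use match() to look up each key's position. The column names (comp01, gym1, comp02, gym2) and the "||" separator are assumptions made for illustration; the real column names in master.treeDQ2 would need to be substituted. Note that cumsum(!duplicated(...)) only gives consistent IDs when identical rows sit next to each other, which is why match() is used here instead.

```r
# Toy data mirroring the simplified example (column names are illustrative)
master <- data.frame(
  comp01 = c("A", "A", "B", "B", "B", "C", "C"),
  gym1   = c("w", "w", "x", "x", "x", "y", "y"),
  comp02 = c("D", "D", "A", "D", "D", "A", "B"),
  gym2   = c("z", "z", "w", "z", "z", "w", "x"),
  stringsAsFactors = FALSE
)

# One key per competitor built from name + gym, so two people who share a
# name but train at different gyms get different keys.
# Assumption: the separator "||" never occurs in a name or gym.
key1 <- paste(master$comp01, master$gym1, sep = "||")
key2 <- paste(master$comp02, master$gym2, sep = "||")

# Unique keys, competitor 1 first; people who only appear as competitor 2
# are appended afterwards and therefore receive the next unused IDs.
all_keys <- unique(c(key1, key2))

# The ID is the position of each key in the combined list
master$ID1 <- match(key1, all_keys)
master$ID2 <- match(key2, all_keys)

print(master)
```

Because both ID columns are looked up in the same vector of keys, a person who appears on both sides gets the same ID in each.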
Simplified version of what I want to achieve in rank_table:
competitor  ID
C           3
D           4
D           4
A           1
B           2
B           2
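
For the rank table, a minimal sketch of carrying the IDs over with a lookup table, again on toy data. It assumes rank_table also has a gym column (the simplified table above only shows the name), and the names lookup, rank_tbl, and gym are hypothetical. match() is used rather than merge() so that the original row order is kept.

```r
# Lookup of one row per competitor (name + gym) with the ID assigned in the
# master data frame. Built by hand here; in practice it would be derived
# from the master data.
lookup <- data.frame(
  competitor = c("A", "B", "C", "D"),
  gym        = c("w", "x", "y", "z"),
  ID         = 1:4,
  stringsAsFactors = FALSE
)

# Toy rank table. The gym column is an assumption: it is needed to tell
# apart two people who share a name.
rank_tbl <- data.frame(
  competitor = c("C", "D", "D", "A", "B", "B"),
  gym        = c("y", "z", "z", "w", "x", "x"),
  stringsAsFactors = FALSE
)

# Match on the combined name + gym key; row order of rank_tbl is preserved.
# Assumption: the separator "||" never occurs in a name or gym.
idx <- match(
  paste(rank_tbl$competitor, rank_tbl$gym, sep = "||"),
  paste(lookup$competitor, lookup$gym, sep = "||")
)
rank_tbl$ID <- lookup$ID[idx]

print(rank_tbl)
```

Competitors in the rank table that are missing from the lookup would get NA, which makes unmatched names easy to spot.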