I am trying to match all customers in a data set and give the same ID to those whose zip codes match 100% and whose address and email match 85%. I was able to do this using the RecordLinkage package in R. Now I have a result like this:
x <- data.frame(ID1=c(1,2, 3, 5, 10, 11, 12), ID2=c(2,5,4,11,11,18,18))
ID1 ID2
1 2
2 5
3 4
5 11
10 11
11 18
12 18
But I want to group together all IDs that match, directly or through a chain of other matches. For example, 1, 2, 5, 11, 10, 12 and 18 are all the same customer, so I would like to give them the same ID.
Basically, I want output like this:
Group Key
1 1
1 2
1 5
1 11
1 10
1 12
1 18
3 3
3 4
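
To make the intended grouping concrete: treating each row of `x` as a link between two IDs, every set of IDs connected by a chain of links should share one group, labelled by its smallest ID. The following is only a minimal base-R sketch of that idea using simple label propagation, not necessarily the best approach; the names `keys`, `group` and `result` are made up for illustration.

```r
x <- data.frame(ID1 = c(1, 2, 3, 5, 10, 11, 12),
                ID2 = c(2, 5, 4, 11, 11, 18, 18))

# Start with every ID in its own group, labelled by itself.
keys  <- sort(unique(c(x$ID1, x$ID2)))
group <- setNames(keys, keys)

# For each matched pair, pull both IDs down to the smaller of their
# current labels; repeat until a full pass changes nothing.
repeat {
  old <- group
  for (i in seq_len(nrow(x))) {
    a <- as.character(x$ID1[i])
    b <- as.character(x$ID2[i])
    m <- min(group[a], group[b])
    group[a] <- m
    group[b] <- m
  }
  if (identical(old, group)) break
}

# One row per ID, with the label of the group it ended up in.
result <- data.frame(Group = unname(group), Key = keys)
result <- result[order(result$Group, result$Key), ]
print(result, row.names = FALSE)
```

On the sample data this puts 1, 2, 5, 10, 11, 12 and 18 in group 1 and 3 and 4 in group 3. The loop is quadratic in the worst case, so for a large customer table a graph library's connected-components routine would probably be a better fit.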