I have a dataframe that I need to make unique IDs for. I was previously using the following code:
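import numpy as np

# Fixed seed so the same random IDs are drawn on every run
np.random.seed(1)
# One entry per distinct full name
names = parent[['givenName', 'familyName']].agg(' '.join, 1).unique().tolist()
# One random 10-digit ID per distinct name
ids = np.random.randint(low=1e9, high=1e10, dtype=np.int64, size = len(names))
maps = {k:v for k,v in zip(names, ids)}
parent['sourcedId'] = parent[['givenName', 'familyName']].agg(' '.join, 1).map(maps)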
I am running into an issue where I'm getting repeated IDs. I don't know whether they come from different people having the same name or from something else. I can't just assign sequential numbers, because the names could come in a different order every time. I also have the option to add parent['phone'] to the ID generation, but multiple people could have the same phone number, which is why I haven't been using it up to this point. Any help would be appreciated.
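
To narrow down where the repeats come from, a check along these lines might help. It is only a sketch: the helper name `diagnose_repeated_ids` and the sample rows (including the hard-coded `sourcedId` values) are hypothetical, and it assumes `parent` already has the `givenName`, `familyName`, and `sourcedId` columns from the code above.

```python
import pandas as pd


def diagnose_repeated_ids(parent: pd.DataFrame) -> dict:
    """Count the two possible sources of repeated sourcedId values."""
    full_name = parent[['givenName', 'familyName']].agg(' '.join, axis=1)

    # Rows that share a full name always share an ID, because the
    # mapping is keyed on the name alone.
    rows_sharing_name = int(full_name.duplicated(keep=False).sum())

    # An ID attached to more than one distinct name means the random
    # draw itself produced the same number twice.
    names_per_id = (
        parent.assign(full_name=full_name)
        .groupby('sourcedId')['full_name']
        .nunique()
    )
    ids_shared_by_different_names = int((names_per_id > 1).sum())

    return {
        'rows_sharing_name': rows_sharing_name,
        'ids_shared_by_different_names': ids_shared_by_different_names,
    }


# Hypothetical sample: two rows named "Ann Lee" (same ID by construction)
# and two different names that happen to carry the same ID.
sample = pd.DataFrame({
    'givenName': ['Ann', 'Ann', 'Bob', 'Cy'],
    'familyName': ['Lee', 'Lee', 'Ray', 'Orr'],
    'sourcedId': [1000000001, 1000000001, 1000000002, 1000000002],
})

report = diagnose_repeated_ids(sample)
print(report)
```

A non-zero `rows_sharing_name` would point to repeated names in the data, while a non-zero `ids_shared_by_different_names` would point to `np.random.randint` drawing the same value more than once, which is possible because it samples with replacement.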