I am looking for a clean way to de-identify values in one column of a CSV file. The approach I was working on was something of a hack that relied on counting values, and I wanted to see if there is a cleaner way to do this. The input would be similar to:
One~Two~Three~Four
Test One~Failed for account 999999999, UserId: 7777777, Error: Duplicate nickname~8abc-964863c382d8~3/11/2021 1:03:43 PM
Test One~Failed for account 121212121, UserId: 3434343, Error: Duplicate firstname~zzzz-964863c382d8~3/11/2021 1:04:43 PM
Test One~Failed for account 565656565, UserId: 7878787, Error: Duplicate firstname~yyyy-964863c382d8~3/11/2021 1:05:43 PM
And I need the output to look like this, with the UserId and account number replaced by X's:
One~Two~Three~Four
Test One~Failed for account XXXXXXXXX, UserId: XXXXXXX, Error: Duplicate nickname~8abc-964863c382d8~3/11/2021 1:03:43 PM
Test One~Failed for account XXXXXXXXX, UserId: XXXXXXX, Error: Duplicate firstname~zzzz-964863c382d8~3/11/2021 1:04:43 PM
Test One~Failed for account XXXXXXXXX, UserId: XXXXXXX, Error: Duplicate firstname~yyyy-964863c382d8~3/11/2021 1:05:43 PM
There will be other lines in the data that may not contain these values, but I only want to mask these two types of values in the Two column. Any ideas would be appreciated!
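For illustration, here is a minimal sketch of one way this could be done in Python with the standard `csv` and `re` modules. It assumes the file is tilde-delimited with a header row, and that the numbers to mask always directly follow the literal text `account ` or `UserId: `. The names `ID_PATTERN`, `mask_ids`, and `deidentify` are made up for this example.

```python
import csv
import io
import re

# Assumption: the sensitive numbers always follow "account " or "UserId: ".
ID_PATTERN = re.compile(r"(account |UserId: )(\d+)")

def mask_ids(text):
    """Replace each matched number with the same count of X's, keeping the label."""
    return ID_PATTERN.sub(lambda m: m.group(1) + "X" * len(m.group(2)), text)

def deidentify(src, dst, column="Two", delimiter="~"):
    """Copy delimited rows from src to dst, masking IDs only in the named column."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    header = next(reader)
    idx = header.index(column)  # position of the column to mask
    writer.writerow(header)
    for row in reader:
        if len(row) > idx:  # leave short or blank rows untouched
            row[idx] = mask_ids(row[idx])
        writer.writerow(row)

# Usage with in-memory data; real files would be opened with newline="".
sample = io.StringIO(
    "One~Two~Three~Four\n"
    "Test One~Failed for account 999999999, UserId: 7777777, "
    "Error: Duplicate nickname~8abc-964863c382d8~3/11/2021 1:03:43 PM\n"
)
result = io.StringIO()
deidentify(sample, result)
print(result.getvalue())
```

Rows whose Two column does not contain either label pass through unchanged, and because the replacement is built from the length of each match, no manual counting of digits is needed.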