0

I am trying to anonymise data in my spreadsheet.

How can I take a name and turn it into a randomly generated, say, 6 digit code?

One caveat is that my dataset contains contacts more than once. How can I handle the duplicates, with them being assigned the same string as a previous entry?

Help appreciated.

Best Regards

Lloyd

I am looking for the following:

Col A              | Col B
foo.bar@gmail.com   124556
bar.foo@gmail.com   445643
foo.bar@gmail.com   124556
Lloyd Rees
  • 55
  • 2
  • 3
  • 9
  • I think this might be helpful here: http://stackoverflow.com/questions/15201297/md5-hash-function-in-excel-without-using-vba – Elisha Sep 18 '14 at 11:22
  • 1
    Does the 6 digit code have to be constant? i.e. would a random number with duplicate checking be sufficient? – Bathsheba Sep 18 '14 at 11:28
  • This (earlier) Q&A might be better: **[Is there an Excel function to create a hash value?](https://superuser.com/questions/550592/is-there-an-excel-function-to-create-a-hash-value)**; maybe this one could be marked as a duplicate, hence (even though it's on SU and not on SO)? – nutty about natty Aug 08 '18 at 08:17

2 Answers2

1

Here is a macro-based solution.
The first part fills column B with non-repeating 6-digit values.
The second part resolves duplicates in column A

Sub SecretCoder()
    Dim col As Collection, i As Long, N As Long, _
        j As Long, v As String, codee As String, _
        ncodee As Long, wf As WorksheetFunction
    Set col = New Collection
    Set wf = Application.WorksheetFunction
    N = Cells(Rows.Count, "A").End(xlUp).Row
    i = 1

    'First Part

    On Error Resume Next
    While col.Count < N
        DoEvents
        ncodee = wf.RandBetween(100000, 999999)
        col.Add ncodee, CStr(ncodee)
        If Err.Number = 0 Then
            Cells(i, 2).Value = ncodee
            i = i + 1
        Else
            Err.Number = 0
        End If
    Wend
    On Error GoTo 0

    'Second Part

    For i = 2 To N
        v = Cells(i, 1).Text
        For j = 1 To i - 1
            If v = Cells(j, 1).Value Then
                Cells(i, 2).Value = Cells(j, 2).Value
                GoTo donV
            End If
        Next j
donV:
    Next i

End Sub
Gary's Student
  • 95,722
  • 10
  • 59
  • 99
0

This technique was originally described here by Excellll on SuperUser.

You could simply take a copy of your email address column on a new tab, remove duplicates, then in the column adjacent, enter

=RAND()

In a third column, enter

=RANK(B1,$B$1:$B$999)+100000

There is a negligible chance that of the random floating point decimals produced by RAND, any two numbers will be the same. Ranking them produces integers, to which you can add 100000 (or any 6-digit figure) to make the code 6 digits.

Back in your original table you can simply VLOOKUP against your new table.

=VLOOKUP(A2,NewSheet!$A$1:$C$999,3,0)

Note the 6-digit codes will change each time you recalculate the workbook. However, you can get round this by simply copying your new table, then pasting special as values to fix those codes permanently.

Community
  • 1
  • 1
Andi Mohr
  • 458
  • 1
  • 6
  • 22