I need to store Social Security numbers in a unique, scrambled form.

The reason: I need to collect Social Security numbers, but I do not want to store them in plain text in case the database gets compromised.

I want to convert each Social Security number into an alphanumeric string, and I would prefer this to be a one-way (irreversible) process.

Then, when I search for an existing SSN, I would apply the same algorithm to the user input and search the database using the resulting alphanumeric string.

In PHP, I could do something like this:

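function maskSSN($SSN) {
    // The "salt" is derived from the SSN itself, so it adds no secret.
    $salt = sha1(md5($SSN));
    // Final value: MD5 of the SSN concatenated with that derived salt.
    $SCRAM = md5($SSN . $salt);
    return $SCRAM;
}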

But I do not think that would produce unique values.

Michael
Andrew
  • Any hash function has collisions. What format do SSNs have? – Maxim May 09 '13 at 19:05
  • Well, each SSN is unique and has 9 digits – Andrew May 09 '13 at 19:06
  • It's easy to brute-force in any form when you know the hash function. With 9-digit SSNs, collisions are unlikely. You can just hash every possible SSN and check the results for uniqueness – Maxim May 09 '13 at 19:08

2 Answers

SSNs have very little entropy: nine digits give at most 10^9 possible values. I therefore wouldn't recommend storing them unencrypted or merely hashed, because an attacker who steals your database could feasibly brute-force every SSN.
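
To make the brute-force concern concrete, here is a minimal sketch that inverts the unkeyed maskSSN() from the question by simply trying candidates. The recoverSSN() helper, the example SSN, and the narrow search range are all hypothetical and chosen so the demo finishes quickly; an attacker would loop over the whole 9-digit space instead.

```php
<?php
// The unkeyed scramble from the question, unchanged in behavior.
function maskSSN(string $ssn): string {
    $salt = sha1(md5($ssn));
    return md5($ssn . $salt);
}

// Hypothetical attacker helper: try every 9-digit candidate in [$from, $to]
// and return the one whose scramble matches the stolen value, or null.
function recoverSSN(string $stolen, int $from, int $to): ?string {
    for ($n = $from; $n <= $to; $n++) {
        $candidate = str_pad((string) $n, 9, '0', STR_PAD_LEFT);
        if (hash_equals($stolen, maskSSN($candidate))) {
            return $candidate;
        }
    }
    return null;
}

// Demo with an example SSN and a deliberately small search range.
$stolen = maskSSN('123456789');
echo recoverSSN($stolen, 123450000, 123459999), "\n"; // prints 123456789
```

Nothing in this loop depends on a secret, which is why any unkeyed hash of an SSN, however many rounds it uses, offers little protection once the algorithm is known.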

Instead, you should encrypt the SSNs with AES-256 or better. See this SO question for more on storing the encryption key properly: Storing encryption keys -- best practices?

Freedom_Ben

If you can store the full (untruncated) hash, you shouldn't have any collisions across 9-digit SSNs with most secure hash functions.

To keep the hashes from being brute-forceable, use HMAC-SHA1 or HMAC-SHA256 with a secret key. Here is a related answer that involved phone numbers and anonymizing data: https://stackoverflow.com/a/15888989/637783
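
As a rough sketch of the keyed-hash approach in PHP, the built-in hash_hmac() can produce a deterministic, searchable token. The ssnToken() helper name, the digit normalization, and the placeholder key are assumptions for illustration; the key would have to live outside the database for this to help.

```php
<?php
// Hypothetical helper: derive a searchable token from an SSN with a keyed hash.
// $key must be a secret stored outside the database (e.g. in app config).
function ssnToken(string $ssn, string $key): string {
    // Keep digits only so "123-45-6789" and "123456789" map to the same token.
    $digits = preg_replace('/\D/', '', $ssn);
    if (strlen($digits) !== 9) {
        throw new InvalidArgumentException('SSN must contain exactly 9 digits');
    }
    return hash_hmac('sha256', $digits, $key); // 64 hex characters
}

// Placeholder key for the demo only; generate a real one with random_bytes(32).
$key = 'demo-key-do-not-use-in-production';
$stored = ssnToken('123-45-6789', $key);   // value saved at insert time
$lookup = ssnToken('123456789', $key);     // value computed from user input
var_dump(hash_equals($stored, $lookup));   // bool(true)
```

Because the same SSN and key always give the same token, the column can be indexed and searched with a plain equality match.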

An AES-256 result wouldn't be usable for lookups without decryption, because AES-256, properly and securely used (with a random IV), produces different ciphertext for the same input. However, it could reasonably be used in a relational table in which the encrypted SSN is stored against a primary key, with other tables referencing that key instead of the SSN.

The latter option would also allow you to rotate your keys fairly simply over time.
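
The point about AES producing different results for the same input can be seen in a small sketch using PHP's OpenSSL extension. The encryptSSN() and decryptSSN() helpers, the choice of GCM mode, and the nonce-tag-ciphertext storage layout are assumptions for illustration, not a prescribed design.

```php
<?php
// Hypothetical helpers using PHP's OpenSSL extension (AES-256-GCM).
// $key must be 32 random bytes kept outside the database.
function encryptSSN(string $ssn, string $key): string {
    $iv = random_bytes(12);                 // fresh nonce for every encryption
    $tag = '';
    $ct = openssl_encrypt($ssn, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($ct === false) {
        throw new RuntimeException('Encryption failed');
    }
    return base64_encode($iv . $tag . $ct); // store nonce and tag with the ciphertext
}

function decryptSSN(string $blob, string $key): string {
    $raw = base64_decode($blob, true);
    if ($raw === false || strlen($raw) < 28) {
        throw new RuntimeException('Malformed ciphertext');
    }
    $iv  = substr($raw, 0, 12);             // 12-byte nonce
    $tag = substr($raw, 12, 16);            // 16-byte authentication tag
    $ct  = substr($raw, 28);                // remaining bytes are the ciphertext
    $pt = openssl_decrypt($ct, 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($pt === false) {
        throw new RuntimeException('Decryption failed');
    }
    return $pt;
}

$key = random_bytes(32);
$a = encryptSSN('123456789', $key);
$b = encryptSSN('123456789', $key);
var_dump($a === $b);              // false in practice: the random nonces differ
var_dump(decryptSSN($a, $key));   // string(9) "123456789"
```

Since two encryptions of the same SSN do not match, an equality search on this column cannot work. Under the table layout described above, rotating the key would mean decrypting and re-encrypting only the rows of that one table.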

jbtule
  • But what if the database gets stolen? Then the HMAC is useless against brute-forcing. Note that an HMAC is only designed to preserve integrity and authentication, not confidentiality. – Freedom_Ben May 12 '13 at 14:53
  • With HMAC, if the database is stolen, you cannot brute-force it unless you get the key as well, just as with AES: if the attacker has the key as well, they simply have the data. But I think that's an inconsequential difference, since brute-forcing SSNs is only slightly less trivial than AES decryption. Also, HMAC is designed as a [PRF](http://en.wikipedia.org/wiki/Pseudorandom_function), so it is specifically designed not to leak information about the plaintext. Also, the OP is primarily concerned with searching, and AES (implemented securely) will be by far the least efficient in this regard. – jbtule May 13 '13 at 13:24