From reading Stack Overflow, it seems that simply using a UUID to confirm a registration via email is bad. Why is that? Why do I need to generate some fancier, less random code from the user's data?

The suggested approaches seem to be variants of user data + salt -> hash. Even when a UUID is used, it always gets hashed. Why is that? There isn't anything to hide or obfuscate, right?

Sorry if this question is stupid.

Right now I am prototyping with Python 3's built-in UUID functions. Is there anything special about them?

Mörkö

1 Answer

The point of email verification is to prevent someone with malicious intentions from registering arbitrary email addresses without having access to the respective inboxes. Consider a prankster who signs a target up for daily cat facts or, more sinister, someone who signs a victim up for a paid email newsletter or floods their inbox (potentially their work inbox) with explicit content. Hence the confirmation code, which must be cryptographically secure. One important property of a cryptographically secure confirmation code is that it cannot be predicted or guessed.

This is why UUIDs are not suitable: the main feature of a UUID is that a collision is astronomically unlikely, but the generation algorithm is not designed to be unpredictable. Typically (as in the time-based version 1 scheme) a UUID is generated from the generating system's MAC address(es), the time of generation and a few bits of entropy. The MAC address and the time are well determined, and a PRNG seeded with nothing more than the PID and the time is perfectly permissible for the remaining bits. The whole point of UUIDs is to avoid collisions, not to be unpredictable or unguessable. For that it suffices to have bits that are unique to the generating system and never change, plus a few bits that keep this particular system from generating the same UUID twice; those can simply be derived from the time, the generating process and the process's internal state.
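To make the point about baked-in, deterministic fields concrete, here is a minimal sketch using Python's standard `uuid` module. It assumes a time-based (version 1) UUID; the helper name `describe_uuid1` and the node value `0x001122334455` are hypothetical and exist only for illustration.

```python
import uuid
from datetime import datetime, timedelta

# Version 1 timestamps count 100 ns intervals since the start of the
# Gregorian calendar.
UUID_EPOCH = datetime(1582, 10, 15)


def describe_uuid1(u: uuid.UUID) -> dict:
    """Split a version 1 UUID into the fields it was built from."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    return {
        # 48-bit node ID, normally a MAC address of the generating host
        "node": f"{u.node:012x}",
        # creation time (UTC), recovered from the 60-bit timestamp
        "created_utc": UUID_EPOCH + timedelta(microseconds=u.time // 10),
        # 14-bit clock sequence, the only part that is not host or time
        "clock_seq": u.clock_seq,
    }


# Hypothetical node and clock sequence so the example is reproducible.
example = uuid.uuid1(node=0x001122334455, clock_seq=0x1234)
info = describe_uuid1(example)
print(example)
print(info)
```

Anyone who sees one such UUID can read the node and the creation time straight out of it, which is why those parts contribute nothing to unguessability.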

So if I know which system is going to generate a UUID (that is, I know its MAC addresses) and the time at which the UUID is generated, only 32 or so extra bits of entropy randomize the UUID. And 32 bits simply don't cut it, security-wise.

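Assume that a confirmation token is valid for 24 hours, that one can make >100 confirmation requests per second, and that the UUID generator has 32 bits of extra randomness (in addition to time and MAC, which we assume to be well known). At 100 requests per second that is about 8.6 million guesses in 24 hours against 2^32 ≈ 4.3 billion possible values, i.e. roughly a 0.2% chance of finding a valid confirmation UUID; at 1,000 requests per second it is roughly 2%.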
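The estimate can be checked with a few lines of arithmetic. This is only a sketch under the assumptions stated above (one valid token, every guess distinct, 32 unknown bits); the function name `guess_success_probability` is made up for this example. Under these assumptions the chance is a fraction of a percent at 100 requests per second and about 2% at 1,000 requests per second.

```python
def guess_success_probability(requests_per_second: float,
                              validity_seconds: float,
                              unknown_bits: int) -> float:
    """Chance that at least one distinct guess hits a single valid token."""
    attempts = requests_per_second * validity_seconds
    keyspace = 2 ** unknown_bits
    return min(1.0, attempts / keyspace)


DAY = 24 * 60 * 60  # token lifetime in seconds

for rate in (100, 1_000):
    p = guess_success_probability(rate, DAY, 32)
    print(f"{rate:>5} req/s for 24 h against 32 bits: {p:.2%}")

# For comparison: the same attack against 128 unpredictable bits.
print(f"{guess_success_probability(1_000, DAY, 128):.1e}")
```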

Note that you cannot simply block confirmation requests once too many invalid UUIDs have been attempted in a given time interval, as this would effectively hand an attacker a DoS tool for preventing legitimate users from confirming their email addresses. Including the email address in the confirmation request doesn't help either; it just allows specific email addresses to be targeted for a DoS.
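For the unpredictable confirmation code called for above, one possible approach in Python 3.6+ is the standard `secrets` module, which draws from the operating system's cryptographically secure random source. The sketch below is an illustration, not a complete verification flow; the function names `new_confirmation_token` and `token_matches` are hypothetical, and storage and expiry of the token are left out.

```python
import hmac
import secrets


def new_confirmation_token(num_bytes: int = 32) -> str:
    """Return a URL-safe token built from num_bytes of CSPRNG output."""
    return secrets.token_urlsafe(num_bytes)


def token_matches(stored: str, submitted: str) -> bool:
    """Compare two tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(stored.encode(), submitted.encode())


token = new_confirmation_token()
print(token)                              # goes into the emailed link
print(token_matches(token, token))        # the legitimate confirmation
print(token_matches(token, "not-the-token"))
```

With 32 random bytes the token carries 256 unpredictable bits, so the brute-force estimate from above no longer yields a meaningful chance of success.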

datenwolf
  • Ok, so to sum up: UUIDs don't overlap, but they are predictable because of deterministic info being baked into them? – Mörkö May 08 '16 at 20:36
  • Follow-up: why use user data as a seed? Wouldn't [insert here a method that uses numbers as random as possible] be a more legitimate approach? – Mörkö May 08 '16 at 20:38
  • Whoops, already asked here. http://stackoverflow.com/questions/23711489/e-mail-verification-with-keys-made-with-uuid-uuid4-safety-and-uniquness-of-gen – Mörkö May 08 '16 at 20:40