0

I am working on a Xamarin.Forms project where users can create accounts; each account is assigned a unique ID when it is created. Should I be using a specific method for creating the ID?

I am using

$id = hash('sha256', $now . $birthday);

Is there a chance there can be a duplicate generated? Should I be using a different method? Or is this fine?

Edit:

$id = hash('sha256', $date . $birthday);
if (DB::query('SELECT * FROM users WHERE id=:id', array(':id' => $id))) {
    // First ID is already taken: retry once with a random suffix mixed in before hashing.
    $id = hash('sha256', $date . $birthday . rand(100, 999));
    if (DB::query('SELECT * FROM users WHERE id=:id', array(':id' => $id))) {
        echo 'try_again';
        return;
    }
}
Dan

3 Answers

2

Is there a chance there can be a duplicate generated? Should I be using a different method?

There is always a chance of getting the same value, because nothing prevents two people born on the same day from triggering your code within the same one-second window. So while SHA-256 itself is fine, the way you use it is not: one second is long enough for a collision. If you want to ensure the same ID is never assigned to different users, always check (i.e. by looking in your DB) whether a given ID is already in use. Your DB schema should also put a UNIQUE constraint on that column to prevent a duplicate ID from being inserted.

Marcin Orlowski
  • If I check and it already exists, what should I do? Add their birthday or something? And how do I check whether the second attempt also exists; should I use a loop? – Dan Mar 25 '18 at 22:12
  • In your case, either just wait 1 second (to ensure `$now` is different) or add something else, e.g. a random value. Then hash again. Sure, a loop is fine. Just loop e.g. 10 times. The chances you will get 10 dupes are basically 0. Still, as a fallback, after 10 unsuccessful iterations just report an error, e.g. "Registration failed, please try again", and let the user retry – Marcin Orlowski Mar 25 '18 at 22:12
  • I edited in my updated code, would you say the updated code is correct? – Dan Mar 25 '18 at 22:35
  • no, it is worse now. SHA hashes are always the same length. You must hash again each time, not concatenate a (**predictable**) value to it, because then you will have IDs of different lengths in your system. That's bad, and not only for the quality of your database structure. Do an ordinary `for` loop 10 times, and in that loop generate the hash, then check whether it is already known. If it is, repeat the loop. If you reach 10 and still have no unused ID, you know you should report an error to the user – Marcin Orlowski Mar 25 '18 at 22:43
  • I'm confused by your comment, a random number is added to the string which is then hashed so it is always 64 characters long, what do you mean different lengths? – Dan Mar 25 '18 at 23:27
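
The bounded retry loop discussed in these comments could look roughly like the sketch below. It is only an illustration under stated assumptions: `generateUniqueId` and the `$idExists` callable are hypothetical names standing in for the real database lookup, and the sketch hashes fresh random bytes on each attempt rather than the timestamp and birthday.

```php
<?php
// Hypothetical helper: $idExists is any callable that returns true when the
// candidate ID is already stored (for example, a SELECT on the users table).
function generateUniqueId(callable $idExists, int $maxAttempts = 10): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        // Hash fresh random bytes on every attempt; the result is always
        // 64 hexadecimal characters.
        $id = hash('sha256', random_bytes(32));
        if (!$idExists($id)) {
            return $id;
        }
    }
    // Every attempt collided: let the caller report "please try again".
    throw new RuntimeException('Could not generate a unique ID');
}

// Usage with an in-memory stand-in for the database lookup (an assumption).
$taken = [];
$id = generateUniqueId(function (string $candidate) use ($taken): bool {
    return isset($taken[$candidate]);
});
```

A check like this can still race with a concurrent registration, so the UNIQUE constraint on the column remains the final safeguard.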
1

Be cool.

var Id = Guid.NewGuid().ToString();
Nick Kovalsky
  • How can I be sure it is unique, since I am not generating it based on the timestamp or username? – Dan Mar 29 '18 at 01:10
  • I was reading another post and I read that `If you have a lot of records, and a clustered index on a GUID, your insert performance will SUCK, as you get inserts in random places in the list of items (thats the point), not at the end (which is quick)`, how much will it impact my insert performance? Is it very noticeable? [Post](https://stackoverflow.com/questions/45399/advantages-and-disadvantages-of-guid-uuid-database-keys) – Dan Mar 29 '18 at 06:39
  • i have 2 fields for this purpose for a record usually. 1 - int id, database identity generated - "insert performance won't suck". i dont use this id field in my logic, its for database purposes only. 2 - string key - unique guid or whatever i want, for records to be database-independent. this key is used as id when item is exported to the real world. i hope i was clear. =) – Nick Kovalsky Mar 29 '18 at 06:47
  • My problem is that if I use int IDs that are auto-incremented by the database, it will most likely run out of numbers, since it's for users (and posts) and users can have multiple posts, which add up. What do bigger companies use to store post/user IDs? – Dan Mar 29 '18 at 06:58
  • Since the int32 max value is 2,147,483,647, if you intend to have more than that in one single table, it's bad table design. – Nick Kovalsky Mar 29 '18 at 07:01
  • How would I go about having more than that? Any suggestions tutorials or articles? – Dan Mar 29 '18 at 07:04
  • would just split data into different tables. why push it up to make a search in a single table with more than 2 billion records... insane. – Nick Kovalsky Mar 29 '18 at 07:21
  • When loading a post though would I have to search all the different post tables, is it possible with one query? And how do i decide which table to put it into? – Dan Mar 29 '18 at 07:24
  • ofc you'll search in one table: a post will belong to a section, thread, subject, country, year, keywords, whatever scheme you choose to split data into different segments/tables. – Nick Kovalsky Mar 29 '18 at 07:44
0

There's no need to use timestamp or birthday, a random value will do:

hash('sha256', random_bytes(32));

The odds of a duplicate value are so low that you don't even need a loop. If you handle possible errors, that's enough.
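
One possible reading of "handle possible errors" is sketched below. It assumes the error in question is `random_bytes()` throwing when no secure randomness source is available; `newAccountId` is a hypothetical helper name, and the `'try_again'` response is borrowed from the question's code.

```php
<?php
// Hypothetical helper: returns null instead of throwing so the caller can
// show a "please try again" message.
function newAccountId(): ?string
{
    try {
        // 32 random bytes hashed with SHA-256 yield 64 hexadecimal characters.
        return hash('sha256', random_bytes(32));
    } catch (Exception $e) {
        // random_bytes() throws if no secure source of randomness is available.
        return null;
    }
}

$id = newAccountId();
if ($id === null) {
    echo 'try_again';
}
```

A failed insert caused by a duplicate ID would be handled the same way, by reporting the error and letting the user retry.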

Max Oriola