I'm setting the IDs of database entries by creating and using a hash value. The problem is that when I start the application again, the hash value of the same source values is different, and I get duplicates (same values, different ID). Below is my example code. Start the CLI, note the hash value it prints, start it again --> different value.

How can I reproduce the same hash value with each instance?

```csharp
static void Main(string[] args)
{
    int drid = 3081;
    DateTime dt = DateTime.ParseExact("2019-04-11 00:23:10", "yyyy-MM-dd HH:mm:ss", null);
    string idAsString = drid.ToString() + dt.ToString();
    Console.WriteLine(idAsString.GetHashCode().ToString());
    Console.ReadKey();
}
```
Camilo Terevinto
Frank Mehlhop
  • I too always have the same value `745858392`. However, I suggest you change `dt.ToString()` to something like `dt.ToString("s")` or [another culture invariant date format](https://learn.microsoft.com/en-us/dotnet/standard/base-types/standard-date-and-time-format-strings) – Justin Lessard Apr 11 '19 at 17:09
  • @CamiloTerevinto It is specifically documented as not being consistent across different executions of the application. – Servy Apr 11 '19 at 17:11
  • This blog post might help: [Why is string.GetHashCode() different each time I run my program in .NET Core?](https://andrewlock.net/why-is-string-gethashcode-different-each-time-i-run-my-program-in-net-core/) – IronGeek Apr 11 '19 at 17:28
  • I think this question is a duplicate; I found my answer there: https://stackoverflow.com/questions/5154970/how-do-i-create-a-hashcode-in-net-c-for-a-string-that-is-safe-to-store-in-a – Frank Mehlhop Apr 12 '19 at 06:42
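
Following the suggestions in the comments above, one way to get a value that stays the same across runs is to format the inputs in a culture-independent way and hash them with a fixed algorithm instead of `string.GetHashCode()`. The sketch below is only an illustration under assumptions: the `StableId` class, its `Compute` method, the `|` separator, and the choice of SHA-256 are hypothetical and not part of the original code.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class StableId
{
    // Hypothetical helper: builds a culture-independent key and hashes it with SHA-256.
    public static string Compute(int drid, DateTime dt)
    {
        // "s" is the sortable date/time format (yyyy-MM-ddTHH:mm:ss), independent of the machine's culture.
        string key = drid.ToString(CultureInfo.InvariantCulture)
                     + "|"
                     + dt.ToString("s", CultureInfo.InvariantCulture);

        using (SHA256 sha = SHA256.Create())
        {
            // Hash the UTF-8 bytes of the key; the digest depends only on those bytes.
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(key));

            // Hex-encode the 32-byte digest into a 64-character string.
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    public static void Main()
    {
        DateTime dt = DateTime.ParseExact(
            "2019-04-11 00:23:10", "yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);

        // Prints the same hex string on every run.
        Console.WriteLine(StableId.Compute(3081, dt));
    }
}
```

The result is a 256-bit value, so it would have to be stored in a string or binary column rather than an `int`.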

1 Answer

Do not do this. Never, ever. Hash codes are far from unique, and semantically different objects will at some point collide, i.e. both will produce exactly the same hash code, which in turn will break your database.

The next thing to notice is that hashing objects is complicated. It is good to keep the hash code the same for as long as the object lives, which implies that you either compute it once, when the object is created, or compute it on the fly while ignoring mutable fields (since in general each mutation would change the hash code).

Even more than that: it isn't easy to come up with a good enough hashing algorithm, one that is unpredictable enough and yet rarely collides.
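
To illustrate the point about ignoring mutable fields, here is a minimal sketch. The `Reading` class and its members are hypothetical names chosen for the example, and the multiplier `397` is just a commonly used odd constant, not a recommendation from the answer. Such a hash code is intended for in-memory hash-based collections, not for database IDs.

```csharp
using System;
using System.Collections.Generic;

public sealed class Reading : IEquatable<Reading>
{
    // Immutable fields: these define identity and feed the hash code.
    public int Drid { get; }
    public DateTime Timestamp { get; }

    // Mutable field: deliberately excluded from Equals and GetHashCode.
    public string Note { get; set; }

    public Reading(int drid, DateTime timestamp)
    {
        Drid = drid;
        Timestamp = timestamp;
    }

    public bool Equals(Reading other)
    {
        return other != null && Drid == other.Drid && Timestamp == other.Timestamp;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Reading);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            // Combine only the immutable fields, so the value never changes during the object's lifetime.
            return (Drid * 397) ^ Timestamp.GetHashCode();
        }
    }

    public static void Main()
    {
        var reading = new Reading(3081, new DateTime(2019, 4, 11, 0, 23, 10));
        var set = new HashSet<Reading> { reading };

        // Mutating a field that is not part of the hash leaves the object findable.
        reading.Note = "edited later";
        Console.WriteLine(set.Contains(reading)); // True
    }
}
```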

Zazaeil
  • While this is correct, using something like SHA2-256 or SHA2-512 (.NET doesn't have SHA3 yet) would make collisions far less likely – Camilo Terevinto Apr 11 '19 at 17:11
  • @CamiloTerevinto, it does not change the thing. Id is unique, hashes are not. – Zazaeil Apr 11 '19 at 17:13
  • "Even more than that: it isn't easy to come up with a good enough hashing algorithm, one that is unpredictable enough and yet rarely collides." It's not that hard. Avoiding collisions mostly comes down to just increasing the size of the hash, as long as the hashing algorithm isn't truly bad. Hashing to an `int` is never going to have a low collision rate for any non-trivial collection size, but it doesn't need to be *that* big for the odds of a collision to be low enough that the sun is more likely to explode before a collision happens, which tends to be good enough for most purposes. – Servy Apr 11 '19 at 17:15
  • @Servy, we are not discussing the hashing algorithm itself, but rather the way you implement `GetHashCode()` on a particular (complicated?) object. – Zazaeil Apr 11 '19 at 17:27
  • @SerejaBogolubov They're asking how to create a hash of an object to store in a database. They're not asking to `GetHashCode`, which is used for storing objects in hash-based in-memory collections. – Servy Apr 11 '19 at 17:30
  • @Servy, please check carefully which methods are being called. More than that, such an approach will sooner or later lead explicitly to a `GetHashCode()` implementation problem. So your counterargument seems unreasonable to me. – Zazaeil Apr 11 '19 at 17:32
  • @SerejaBogolubov No, you do *not* in fact need to inevitably call `GetHashCode` to compute a hash of a string. You can *absolutely* write code to compute a hash without using it. That you think it's impossible to write any hashing algorithm besides `string.GetHashCode` makes no sense. Why do you think that's the only possible way to hash anything? – Servy Apr 11 '19 at 17:34
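
The argument in the comments about hash size can be made concrete with the usual birthday approximation, p ≈ 1 − exp(−n(n−1) / 2^(b+1)), for n values hashed uniformly into b bits. The sketch below is a rough estimate under that uniformity assumption; the `Birthday` class and `CollisionProbability` method are hypothetical names, and the row count of 100000 is an arbitrary example.

```csharp
using System;

public static class Birthday
{
    // Approximate probability of at least one collision among n uniformly
    // distributed hash values of the given bit width (birthday approximation).
    public static double CollisionProbability(double n, int bits)
    {
        double space = Math.Pow(2.0, bits);
        double x = n * (n - 1) / (2.0 * space);

        // For very small x, 1 - exp(-x) loses precision in floating point,
        // so fall back to the first-order approximation p ≈ x.
        return x < 1e-8 ? x : 1.0 - Math.Exp(-x);
    }

    public static void Main()
    {
        double n = 100000; // example number of rows (assumption)

        foreach (int bits in new[] { 32, 64, 256 })
        {
            Console.WriteLine("{0} bits: p = {1:E3}", bits, CollisionProbability(n, bits));
        }
    }
}
```

With 100000 entries, a 32-bit hash such as an `int` gives a collision probability of about 0.69 under this approximation, while a 256-bit hash gives a value that is negligible for practical purposes.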