2

I have a piece of code that is

// Bernstein hash
// http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx           
ulong result = (ulong)s[0];
for ( int i = 1; i < s.Length; ++i ) 
{ 
    result = 33 * result + (ulong)s[i]; 
}
return (int)result % Buckets.Count; 

and the problem is that it's sometimes returning negative values. I know the reason is because (int)result can be negative. But I want to coerce it to be non-negative since it's used as an index. Now I realize I could do

int k = (int)result % Buckets.Count; 
k = k < 0 ? k*-1 : k;
return k; 

but is there a better way?

On a deeper level, why is int used for the index of containers in C#? I come from a C++ background and we have size_t which is an unsigned integral type. That makes more sense to me.

MethodMan
  • 18,625
  • 6
  • 34
  • 52
user6048670
  • 2,861
  • 4
  • 16
  • 20
  • 2
    Why not do the `%` before casting to `int`? Cast `Buckets.Count` to `ulong` instead if necessary. This will still limit you to 31-bit range, but at least it works properly :) As for your deeper question, indices aren't necessarily zero-based in .NET. It's perfectly legal to have an array that goes from -10 to +10. – Luaan Apr 10 '16 at 22:44

2 Answers2

2

Use

return (int)(result % (ulong)Buckets.Count);

As you sum up values you reach a positive integer number that cannot be expressed as a positive number in a signed 32 bit integer. The cast to int will return a negative number. The modulo operation will then also return a negative number. If you do the modulo operation first, you will get a low positive number and the cast to int will do no harm.

NineBerry
  • 26,306
  • 3
  • 62
  • 93
1

While you can find a way to cast this to an int properly, I'm wondering why you don't just calculate it as an int from the beginning.

int result = (int)s[0]; // or, if s[0] is already an int, omit the cast
for ( int i = 1; i < s.Length; ++i ) 
{ 
    result = 33 * result + (int)s[i]; 
}
return Math.Abs(result) % Buckets.Count;

As to why C# uses a signed int for indexes, it has to do with cross-language compatibility.

Community
  • 1
  • 1
Tim S.
  • 55,448
  • 7
  • 96
  • 122