0
unsigned long hash(char *str) 
{
    unsigned long hash = 5381;
    int c;
    while ((c = *str++))
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    return hash % NUM_BUCKETS;
}

With this code, when you put in the function 20 letters(such as zzzzzzzzzzzzzzzzzzzzzzzzzzzz) you get an output of a huge number. how does the long hold the numbers if it is restricted to only 32 bits?

hessam hedieh
  • 780
  • 5
  • 18
  • `you get an output of a huge number` How _exactly_ do you check the size of the output? `printf(what here?)`? – KamilCuk Aug 28 '21 at 19:33
  • Does this answer your question? [djb2 Hash Function](https://stackoverflow.com/questions/2571683/djb2-hash-function) – Ruud Helderman Aug 28 '21 at 21:10

2 Answers2

0

You should first check that an unsigned long is 32 bits. If you're getting values above about (roughly) 4.2 billion, it's almost certainly wider than that(a).

You can check this by compiling and running the following program:

#include <limits.h>
#include <stdio.h>

int main(void) {
    printf("%d\n%zu\n", CHAR_BIT, sizeof(unsigned long));
    return 0;
}

The first value is the number of bits in a byte, the second the number of bytes in an unsigned long. Multiplying the two will therefore give you the number of bits in the unsigned long type.

On my system, I get 8 and 8, indicating a 64-bit size.


(a) The ISO C standard does not mandate an exact size for the original types found in C (though it may for things like uint32_t). In fact it doesn't directly even mandate the number of bits at all.

What is does mandate is the minimum range requirements which, for unsigned long is 0..4294967295 (the 4.2 billion I mentioned before).

However, an implementation is free to provide you with something larger, such as a 128-bit type, which would give you a range from zero up to about 1038, or a hundred million million million million million million.

As an aside, I could have used billions, trillions, or even quadrillions but:

  • there's sometimes disagreement as to that actual powers of ten they represent; and
  • the use of many "million" suffixes drives home the size more than a single rarely knows suffix like "undecillion" or "sextillion".
user3386109
  • 34,287
  • 7
  • 49
  • 68
paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
0

unsigned long is at least 32 bits, but it can be larger. It's a 64-bit type with most compilers on most 64-bit processors except under Windows. So a function returning unsigned long can return a value that's larger than 232.

However, the function you show is guaranteed to return a number in the range from 0 to NUM_BUCKETS inclusive. If you see a value that's larger than NUM_BUCKETS, what you're seeing is not a value returned by this function. Maybe there's a bug in your code. Make sure that you've enabled all reasonable warnings on your compiler and that you've resolved them correctly (not by blindly adding a cast). If you still don't understand your program's output, use a debugger and inspect intermediate values. If you still don't understand what your program is doing, you can ask online, with complete code that reproduces the problem.

Gilles 'SO- stop being evil'
  • 104,111
  • 38
  • 209
  • 254