
I have a simple program that computes the CRC32 of a string read from stdin, but for some reason it produces the CRC32B value instead of the CRC32.

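Here is my code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

int main( int argc, char *argv[] ) {

  unsigned long crc=0L;
  char *stdinput = malloc(1024);

  /* Get zlib's initial CRC value */
  crc = crc32( 0L, Z_NULL, 0 );

  fgets(stdinput, 1024, stdin);

  /* The - 1 drops the trailing newline left by fgets */
  crc = crc32( crc, (const unsigned char *)stdinput, strlen(stdinput) - 1 );
  printf("%s 0x%08lx\n", stdinput, crc );

  return 0;
}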

I know there are overflow problems in the program, but that's not necessarily my issue.

The problem is that the output looks like this:

`echo test | ./crc32` results in 0xd87f7e0c

and not 0xaccf8b33, which I verified here: https://www.tools4noobs.com/online_tools/hash/

zlib is definitely computing CRC32B and not CRC32.

How would I modify this so I get the correct output?

I'm running this on a 64-bit Debian machine.

Any help with this would be greatly appreciated. Thank you.

sepp2k
TyrantUT
  • Zlib uses whatever CRC algorithm it chose to implement. If you want a different algorithm, you'll need to use a different implementation. If you need a specific algorithm, you'll need to ensure that the implementation uses that specific algorithm. – Ross Ridge Aug 17 '16 at 20:33
  • There are at least five different CRC32 polynomials in relatively common use, and if you add to that different schemes for padding and endianness, maybe 10-20 different ways to get a "crc32" on an identical block of data. – Brian McFarland Aug 17 '16 at 20:34
  • `unsigned long` is not guaranteed to be a 32-bit type. Get used to the `stdint.h` fixed-width types (and, for printing, the macros from `inttypes.h`). – too honest for this site Aug 17 '16 at 20:45
  • @RossRidge: Sure? Sounds pretty bad to me. That makes two versions potentially incompatible. – too honest for this site Aug 17 '16 at 20:47
  • Possibly related: [What is the difference between crc32 and crc32b?](http://stackoverflow.com/questions/15861058/what-is-the-difference-between-crc32-and-crc32b). – John Bollinger Aug 17 '16 at 21:54
  • @Olaf Obviously Zlib didn't choose an algorithm arbitrarily; they chose to implement the algorithm used by gzip, so it can read and create `.gz` files that are compatible with gzip. But it's not going to implement some other incompatible CRC algorithm for no reason. (Although it did invent its own Adler-32 checksum for its own zlib format for performance reasons.) So if the original poster wants some other algorithm, then they're going to have to find a different implementation. – Ross Ridge Aug 17 '16 at 22:22
  • @RossRidge You're trying to draw a distinction between zlib and gzip which doesn't really exist. zlib was derived from the gzip source code, by the same authors who created gzip in the first place. They're effectively one and the same. –  Aug 17 '16 at 22:36
  • This all sounds great, and thank you for the information, but it doesn't really answer the question. If the website uses mhash, then that is the code I need to use. I'll see if I can find some examples and code this up using that. – TyrantUT Aug 17 '16 at 23:03
  • I thought I did answer your question, but to be explicit, there's nothing you can do to modify your example code to generate the CRC value you expect using Zlib's `crc32` function. You need to use something else that implements the particular CRC algorithm you require. Also note that there are no universally accepted names for the many different 32-bit CRC variations. Any 32-bit CRC implementation can legitimately be called CRC32. – Ross Ridge Aug 17 '16 at 23:21
  • Right, I just don't know exactly which is used to create the one I mentioned. The website seems to be using the one I need, and it was mentioned mhash is the way to do it. So I have to find some example code that works. For some reason, the mhash library that I found is having problems finding their own header files... Weird stuff actually. – TyrantUT Aug 17 '16 at 23:28
  • @Olaf: an `unsigned long` is guaranteed to be _at least_ 32 bits, so it is perfectly fine to use for a 32-bit CRC. – Mark Adler Aug 18 '16 at 02:25
  • @TyrantUT: How do you know what 32-bit CRC you want? What is the application? Do you have a reference for the definition of the CRC? Your statement: "The website _seems_ to be using the one I need" does not engender confidence. – Mark Adler Aug 18 '16 at 02:28
  • @MarkAdler: Typically one wants to use that CRC for some serialised communication, so using the correct type from the start is a good idea. It is also more self-documenting to use an appropriate type (like one uses `size_t` for the result of `sizeof`). – too honest for this site Aug 18 '16 at 19:51
  • @Olaf: However `uint32_t` is not guaranteed to exist, whereas `unsigned long` is guaranteed to exist. So `unsigned long` is more portable, which is why zlib uses it for the `crc32()` return value. – Mark Adler Aug 18 '16 at 19:59
  • @MarkAdler: Hmm, fair point! I might have thought too many steps ahead. Changing to a recommendation. – too honest for this site Aug 18 '16 at 20:01

1 Answer

The results you're getting from your code are correct.

The web site you're referencing uses the (obsolete) mhash library to hash user input. The "crc32" implementation in mhash is an uncommon variant, which mhash describes as the one used in Ethernet. The hash implemented by mhash as "crc32b" is, in reality, the one typically referred to as CRC32, and it is the one zlib computes.
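To check which variant a tool computes without depending on any library, the common CRC-32 can be written out bit by bit and compared against zlib's output. The following is a minimal sketch, assuming the usual parameters for that variant (reflected polynomial 0xEDB88320, initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF). The function name `crc32_bitwise` is hypothetical and not part of zlib or mhash.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-by-bit CRC-32 in the common (zlib/gzip/PNG) variant.
 * Assumed parameters: reflected polynomial 0xEDB88320,
 * initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.
 * crc32_bitwise is an illustrative name, not a library function.
 */
uint32_t crc32_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        /* Fold the next input byte into the low 8 bits */
        crc ^= data[i];

        /* Process the byte one bit at a time, least significant bit first */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }

    /* Final inversion */
    return crc ^ 0xFFFFFFFFu;
}
```

Calling `crc32_bitwise((const unsigned char *)"test", 4)` returns 0xd87f7e0c, the same value the program in the question prints, which is consistent with zlib computing the common CRC-32. A table-driven version would be faster, but the bitwise form makes the parameters easier to compare against another implementation.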

  • Do you know of a place I can get some working mhash code to work with? I keep getting header errors of functions not found from the example I have. – TyrantUT Aug 17 '16 at 23:18
  • Don't use mhash. It's old, buggy, and hasn't been updated since 2008. –  Aug 18 '16 at 02:19