0

Problem

I am trying to implement a hash function for a simple hash table using the crc32c instruction in sse4.2 x86 extension. However I am not that comfortable with these kinds of problems yet so I have some issues.

I have looked up that there is a function unsigned int _mm_crc32_u8 (unsigned int crc, unsigned char v) (source: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=SSE4_2&expand=1283 ) which takes unsigned char and returns the hash. According to my understanding the crc variable is there for error-checking purposes for leading bits which does not concern me (I can either set it to 0 or 0xffffffff and don't care about it).

However I have a string char *s and want to compute the hash of it.

Questions

(in order of importance)

  1. How to use _mm_crc32_u8 to hash the whole string? Should I hash all of the chars in s separately and bitwise XOR the results? I really have no clue.
  2. The function _mm_crc32_u8 takes unsigned chars. Is it OK to just cast unsigned char on char to convert it like (unsigned char)s[0] in this context? As far as I know it is because ASCII chars only have values 0 to 127 so it does not overflow in the casting process.
  3. There are also functions on multiple bytes (_mm_crc32_u{16,32,64}), should I perhaps use these for better performance since they can hash up to 4 bytes at once?

Details

#include <nmmintrin.h> provides the functions above. I am compiling it with clang flag -msse4.2

VaNa
  • 311
  • 4
  • 17
  • CRC is *not* a hash function. Well, not a *good* hash function. – Eugene Sh. Dec 20 '18 at 14:58
  • @EugeneSh. I know. It can be used as one however and since it is implemented by CPUs on the HW level it could be pretty fast and I am aiming for fast. See https://stackoverflow.com/questions/10953958/can-crc32-be-used-as-a-hash-function – VaNa Dec 20 '18 at 15:00
  • Well, as @EugeneSh. stated, CRC is not a great hash function. Yes, it is fast, but depending on why do you need it it might not produce the hash of acceptable quality. For example, it certainly is not cryptographic quality. – SergeyA Dec 20 '18 at 15:09
  • @SergeyA I don't need it to be secure, just fast with not that many collisions and that seems to be the perfect application. – VaNa Dec 20 '18 at 15:21
  • 1
    Possible duplicate of [Implementing SSE 4.2's CRC32C in software](https://stackoverflow.com/questions/17645167/implementing-sse-4-2s-crc32c-in-software) – Mark Adler Dec 21 '18 at 03:22
  • @MarkAdler I have seen the question and judged it is not a duplicate since the name states "in software". Also I wanted ot implement C version first since I'm not yet familiar with `x86 asm` so I will have to take time to understand your code. – VaNa Dec 21 '18 at 11:00
  • 1
    Did you look at the linked answer? It provides code to do it with and without the crc32c instruction, automatically selecting depending on whether the processor has the instruction. The code also makes efficient use of pipelining with the crc32c instruction, running three of those instructions in parallel on a single core. It also answers your question on the use of crc32q and crc32b. – Mark Adler Dec 21 '18 at 15:12

1 Answers1

2

I think, you misunderstand the way those functions work. They are expected to be called in sequence for every character in the string you need to calculate CRC for (or word if you use larger argument version of it, which you should).

But the idea remains the same: first you init CRC to 0, than you call CRC function in a loop, giving the previous value of CRC in the first argument, and hashed value in the second. You store the result in CRC and keep calm and carry on.

Here is code example:

uint64_t byte_crc32(unsigned int crc, const char** buf, size_t len) {
    while (len > 0) {
        crc = _mm_crc32_u8(crc, *(const unsigned char*)(*buf));
        ++*buf;
        --len;
    }

    return crc;
}

uint64_t hw_crc32(unsigned int crc, const char** buf, size_t len) {
    while (len > 0) {
        crc = _mm_crc32_u16(crc, *(const uint16_t*)(*buf));
        *buf += 2;
        --len;
    }

    return crc;
}

uint64_t word_crc32(unsigned int crc, const char** buf, size_t len) {
    while (len > 0) {
        crc = _mm_crc32_u32(crc, *(const uint32_t*)(*buf));
        *buf += 4;
        --len;
    }

    return crc;
}

uint64_t dword_crc32(uint64_t crc, const char** buf, size_t len) {
    while (len > 0) {
        crc = _mm_crc32_u64(crc, *(const uint64_t*)(*buf));
        *buf += 8;
        --len;
    }

    return crc;
}


uint64_t crc(const char* buff, size_t len) {

    const size_t dword_chunks = len / 8;
    const size_t dword_diff = len % 8;

    const size_t word_chunks = dword_diff / 4;
    const size_t word_diff = dword_diff % 4;

    const size_t hw_chunks = word_diff / 2;
    const size_t hw_diff = word_diff % 2;

    uint64_t crc = byte_crc32(0, &buff, hw_diff);
    crc = hw_crc32(crc, &buff, hw_chunks);
    crc = word_crc32(crc, &buff, word_chunks);
    crc = dword_crc32(crc, &buff, dword_chunks);

    return crc;
}
SergeyA
  • 61,605
  • 5
  • 78
  • 137
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/185581/discussion-on-answer-by-sergeya-how-to-implement-a-hash-function-for-strings-in). – Samuel Liew Dec 21 '18 at 01:32
  • Shouldn't dword_crc32 be using `*buf += 8` ? I'm wondering if most compilers are smart enough to optimize the update to buf so that a local copy (register) is updated during the loop and the parameter buf is only updated once after the loop completes. – rcgldr Dec 21 '18 at 13:34
  • @rcgldr certainly so, thanks for noticing. This code went through a couple of edits, and this was an oversight. Fixing it. As for optimization - try and see :) – SergeyA Dec 21 '18 at 14:50