
Basically, I want to be able to write this in my code:

 Engine.getById(WSID('some-id'));

and have it transformed into

 Engine.getById('1a61bc96');

just before it is compiled to assembly, that is, at compile time.

This is my attempt:

constexpr int WSID(const char* str) {
    boost::crc_32_type result;
    result.process_bytes(str,sizeof(str));
    return result.checksum();
}

But I get this error when compiling with MSVC 18 (CTP November 2013):

error C3249: illegal statement or sub-expression for 'constexpr' function

How can I implement the WSID function, this way or any other, as long as it is evaluated at compile time?

I also tried the approach from "Compile time string hashing", but got:

 warning C4592: 'crc32': 'constexpr' call evaluation failed; function will be called at run-time

EDIT:

I first heard about this technique in Game Engine Architecture by Jason Gregory. I contacted the author, who kindly replied:

What we do is to pass our source code through a custom little pre-processor that searches for text of the form SID('xxxxxx') and converts whatever is between the single quotes into its hashed equivalent as a hex literal (0xNNNNNNNN). [...]

You could conceivably do it via a macro and/or some template metaprogramming, too, although as you say it's tricky to get the compiler to do this kind of work for you. It's not impossible, but writing a custom tool is easier and much more flexible. [...]

Note also that we chose single quotes for SID('xxxx') literals. This was done so that we'd get some reasonable syntax highlighting in our code editors, yet if something went wrong and some un-preprocessed code ever made it thru to the compiler, it would throw a syntax error because single quotes are normally reserved for single-character literals.

Note also that it's crucial to have your little pre-processing tool cache the strings in a database of some sort, so that the original strings can be looked up given the hash code. When you are debugging your code and you inspect a StringId variable, the debugger will normally show you the rather unintelligible hash code. But with a SID database, you can write a plug-in that converts these hash codes back to their string equivalents. That way, you'll see SID('foo') in your watch window, not 0x75AE3080 [...]. Also, the game should be able to load this same database, so that it can print strings instead of hex hash codes on the screen for debugging purposes [...].

While preprocessing has some major advantages, it means that I have to set up some kind of output system for the modified files (they would be stored elsewhere, and MSVC would then need to be pointed at them), which might complicate the build. Is there a way to preprocess files, with Python for instance, without headaches? That is not the question, though: I'm still interested in using a compile-time function (for the cache, I could use an ID index).
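For reference, here is a minimal sketch of the kind of tool described in the reply above, written in C++ rather than Python. It is not Jason Gregory's actual tool: the names `crc32_runtime` and `expand_sids` are made up for illustration, the choice of CRC-32 as the hash is an assumption, and the "database" is just an in-memory `std::map` from hash to original string.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <map>
#include <regex>
#include <string>

// Plain runtime CRC-32 (reflected, polynomial 0xEDB88320), computed bit by bit.
std::uint32_t crc32_runtime(const std::string& s) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char c : s) {
        crc ^= c;
        for (int k = 0; k < 8; ++k)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

// Hypothetical helper: replaces every SID('text') in `source` with a
// 0xNNNNNNNN literal and records hash -> text in `db` for later lookup.
std::string expand_sids(const std::string& source,
                        std::map<std::uint32_t, std::string>& db) {
    static const std::regex pattern("\\bSID\\('([^']*)'\\)");
    std::string out;
    std::size_t last = 0;
    for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(source, last, pos - last);       // text before the match
        const std::uint32_t hash = crc32_runtime(m[1].str());
        db[hash] = m[1].str();                      // remember the original string
        char literal[16];
        std::snprintf(literal, sizeof literal, "0x%08X",
                      static_cast<unsigned>(hash));
        out += literal;
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(source, last, std::string::npos);    // text after the last match
    return out;
}
```

A real tool would additionally read and write files and persist `db` somewhere, which is exactly the build-system plumbing I would like to avoid.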

Vinz243
  • See the answers here: http://stackoverflow.com/questions/3226211/why-is-it-ill-formed-to-have-multi-line-constexpr-functions – Nim Feb 23 '15 at 14:57
  • @Nim So I have to rewrite it in one line? – Vinz243 Feb 23 '15 at 15:07
  • @Vinz243 If you're using C++11 and not -14, yes. Unless you can persuade the compiler to run a non-constexpr function at compile time nonetheless. – Columbo Feb 23 '15 at 15:08
  • Also, are we talking about strings or is that multi-character constant part of your intention? – Columbo Feb 23 '15 at 15:11
  • My point was that I don't think you can do this, because the Boost CRC computer is quite complex and unlikely to be `constexpr`-capable. If you really want to do this at compile time, you'll have to hack together some template goodness... – Nim Feb 23 '15 at 15:14
  • There was a code golf to do this some years ago. http://codegolf.stackexchange.com/questions/3268/compute-the-crc32-table-at-compile-time – Moby Disk Feb 23 '15 at 15:22
  • Hmm... So forget about constexpr and use preprocessor macros. – Vinz243 Feb 23 '15 at 15:36
  • @MobyDisk But this only gives me the table. How do I hash a string then? – Vinz243 Feb 23 '15 at 15:57
  • The first error message "type not allowed for 'constexpr'" is wrong or misleading: `std::string const&` is a literal type (since it's a reference) and hence allowed as the type for a function parameter of a constexpr function. clang++ and g++ also accept it (in C++11 mode). – dyp Feb 23 '15 at 16:49
  • @dyp fixed first error, thanks! See edit – Vinz243 Feb 23 '15 at 17:35

3 Answers


Here is a solution that works entirely at compile time, but may also be used at runtime. It is a mix of constexpr, templates and macros. You may want to change some of the names or put them in a separate file since they are quite short.

Note that I reused code from this answer for the CRC table generation, and I based the implementation on code from this page.

I have not tested it on MSVC since I don't currently have it installed in my Windows VM, but I believe it should work, or at least be made to work with trivial changes.

Here is the code. You may use the crc32 function directly, or the WSID function, which more closely matches your question:

#include <cstring>
#include <cstdint>
#include <iostream>

// Generate CRC lookup table
template <unsigned c, int k = 8>
struct f : f<((c & 1) ? 0xedb88320 : 0) ^ (c >> 1), k - 1> {};
template <unsigned c> struct f<c, 0>{enum {value = c};};

#define A(x) B(x) B(x + 128)
#define B(x) C(x) C(x +  64)
#define C(x) D(x) D(x +  32)
#define D(x) E(x) E(x +  16)
#define E(x) F(x) F(x +   8)
#define F(x) G(x) G(x +   4)
#define G(x) H(x) H(x +   2)
#define H(x) I(x) I(x +   1)
#define I(x) f<x>::value ,

constexpr unsigned crc_table[] = { A(0) };

// Constexpr implementation and helpers
constexpr uint32_t crc32_impl(const uint8_t* p, size_t len, uint32_t crc) {
    return len ?
            crc32_impl(p+1,len-1,(crc>>8)^crc_table[(crc&0xFF)^*p])
            : crc;
}

constexpr uint32_t crc32(const uint8_t* data, size_t length) {
    return ~crc32_impl(data, length, ~0);
}

constexpr size_t strlen_c(const char* str) {
    return *str ? 1+strlen_c(str+1) : 0;
}

constexpr int WSID(const char* str) {
    return crc32((uint8_t*)str, strlen_c(str));
}

// Example usage
using namespace std;

int main() {
    cout << "The CRC32 is: " << hex << WSID("some-id") << endl;
}

The first part takes care of generating the table of constants, while crc32_impl is a standard CRC32 implementation converted to a recursive style that works with a C++11 constexpr. Then crc32 and WSID are just simple wrappers for convenience.
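Two practical caveats apply: a pointer cast such as `(uint8_t*)str` is not permitted in a constant expression by stricter compilers, and a constexpr function called in an ordinary expression may still be evaluated at run time. The following is a rough C++11 sketch of one way around both, not a drop-in replacement for the code above: it skips the lookup table and processes each byte bit by bit, and the names `crc_bits`, `crc_str`, `wsid_nocast` and `WSID_CT` are invented for this example. The `static_assert` uses the well-known CRC-32 check value for "123456789".

```cpp
#include <cstdint>
#include <type_traits>

// Apply k single-bit steps of the reflected CRC-32 (polynomial 0xEDB88320).
constexpr std::uint32_t crc_bits(std::uint32_t crc, int k) {
    return k == 0 ? crc
                  : crc_bits((crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1),
                             k - 1);
}

// Fold a null-terminated string into the running CRC, one byte at a time.
// Each char is converted by value, so no pointer cast is needed.
constexpr std::uint32_t crc_str(const char* s, std::uint32_t crc) {
    return *s ? crc_str(s + 1, crc_bits(crc ^ static_cast<std::uint8_t>(*s), 8))
              : crc;
}

constexpr std::uint32_t wsid_nocast(const char* s) {
    return ~crc_str(s, 0xFFFFFFFFu);
}

// Using the result as a template argument forces compile-time evaluation;
// the program is ill-formed if the hash cannot be computed by the compiler.
#define WSID_CT(s) (std::integral_constant<std::uint32_t, wsid_nocast(s)>::value)

static_assert(WSID_CT("123456789") == 0xCBF43926u, "CRC-32 check value mismatch");
```

With this, `Engine.getById(WSID_CT("some-id"))` would pass a plain integer constant, assuming `getById` accepts a 32-bit integer.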

tux3
  • That's pretty slick. Would you please explain what you are doing with the template classes a bit more? It looks like you are recursively inheriting until k = 0 and then defining the final template param as value? I'm not well versed in template class specialization and what you seem to be doing here. – jschultz410 Mar 02 '15 at 05:48
  • @jschultz410 we are essentially using the same technique twice, with the constexpr and with the templates. We're turning [the reference implementation](http://www.faqs.org/rfcs/rfc1952.html) into a recursive function. The templates implement the core loop `for (k = 0; k < 8; k++){...}` but using recursion, and the macros replace the outer loop `for (n = 0; n < 256; n++) {...}` by simply calling the template 256 times. – tux3 Mar 02 '15 at 07:34
  • Thank you! But it says `warning C4592: 'crc32_impl': 'constexpr' call evaluation failed; function will be called at run-time` and the same for crc32 :/ – Vinz243 Mar 02 '15 at 11:58
  • @Vinz243 [A comment on this post](http://blogs.msdn.com/b/vcblog/archive/2013/11/18/announcing-the-visual-c-compiler-november-2013-ctp.aspx?PageIndex=2) says "The compiler team has confirmed that this warning is spurious - constexpr evaluation will succeed anyways." and [this bug report](http://connect.microsoft.com/VisualStudio/feedbackdetail/view/931288/warning-when-constexpr-function-calls-another-constexpr-function-visual-c-november-2013-ctp) says it is fixed. Does it work with an up-to-date version of MSVC, and have you tried stepping through with a debugger ? – tux3 Mar 02 '15 at 12:24
  • I need to get vs2012 working to get the debugger (licence problem) – Vinz243 Mar 02 '15 at 13:49
  • @Vinz243 alright. Perhaps it would be wise to upgrade to VS 2013 or even the more recent CTPs if you want to use more modern features. – tux3 Mar 02 '15 at 13:52
  • I'll try VS 2015 preview :) – Vinz243 Mar 02 '15 at 17:01
  • @Vinz243 great! I made a couple of changes, maybe it will work better now. – tux3 Mar 02 '15 at 18:08
  • I'll tell you ASAP (that means tomorrow or Wednesday :(). Can't wait! – Vinz243 Mar 02 '15 at 18:09
  • Visual Studio 2015 didn't throw any warning. However, when stepping into the line using debugger, it jumps into WSID and the other functions :'( (it took 10 ms almost) – Vinz243 Mar 03 '15 at 11:14
  • @Vinz243 that's strange, what if you do `constexpr int result = WSID("some-id"); cout< – tux3 Mar 03 '15 at 11:27
  • I got `fatal error C1001: An internal error has occurred in the compiler. 1> (compiler file 'f:\dd\vctools\compiler\cxxfe\sl\p1\c\constexpr.cpp', line 1208)` – Vinz243 Mar 03 '15 at 18:46
  • @Vinz243 Welp. If MSVC 2015 still can't handle `constexpr`, getting a compile-time CRC32 function is not going to be easy :/ I'll try to install it and run some tests if you want, I'm surprised it takes so little to get an internal error out of it. – tux3 Mar 03 '15 at 18:51
  • I awarded you the bounty, given your score, even though it didn't work on MSVC. I prefer awarding bounties rather than losing them – Vinz243 Mar 06 '15 at 22:08
  • @Vinz243 Thanks! I'm sorry it crashes MSVC, I really wasn't expecting that. I think the idea of a preprocessing tool might work instead. – tux3 Mar 06 '15 at 22:17
  • As @KevinKeane said! But however I'm switching to unreal for now so... We should be able to split up bounties – Vinz243 Mar 06 '15 at 22:19
  • that's awesome! a nice implementation. actually very similar to what I was looking for :) – Alexander Oh Nov 19 '15 at 10:02
  • In C++20 you don't need recursion at all; recursion causes a stack overflow when running crc32 on big buffers. Also, reinterpret_cast (or a C-style cast) is forbidden in constexpr, so your sample won't compile if it is actually used in a `constexpr` context. – IkarusDeveloper Mar 11 '22 at 02:58

If anyone is interested, I coded up a CRC-32 table generator function and code generator function using C++14 style constexpr functions. The result is, in my opinion, much more maintainable code than many other attempts I have seen on the internet and it stays far, far away from the preprocessor.

Now, it does use a custom std::array 'clone' called cexp::array, because G++ seems not to have added the constexpr keyword to the non-const reference index access/write operator.

However, it is quite lightweight, and hopefully the keyword will be added to std::array in the near future. But for now, the very simple array implementation is as follows:

#include <cstddef>
#include <cstdint>
#include <iostream>

namespace cexp
{

    // Small implementation of std::array, needed until constexpr
    // is added to the function 'reference operator[](size_type)'
    template <typename T, std::size_t N>
    struct array {
        T m_data[N];

        using value_type = T;
        using reference = value_type &;
        using const_reference = const value_type &;
        using size_type = std::size_t;

        // This is NOT constexpr in std::array until C++17
        constexpr reference operator[](size_type i) noexcept {
            return m_data[i];
        }

        constexpr const_reference operator[](size_type i) const noexcept {
            return m_data[i];
        }

        constexpr size_type size() const noexcept {
            return N;
        }
    };

}

Now, we need to generate the CRC-32 table. I based the algorithm off some Hacker's Delight code, and it can probably be extended to support the many other CRC algorithms out there. But alas, I only required the standard implementation, so here it is:

// Generates CRC-32 table, algorithm based from this link:
// http://www.hackersdelight.org/hdcodetxt/crc.c.txt
constexpr auto gen_crc32_table() {
    constexpr auto num_bytes = 256;
    constexpr auto num_iterations = 8;
    constexpr auto polynomial = 0xEDB88320;

    auto crc32_table = cexp::array<uint32_t, num_bytes>{};

    for (auto byte = 0u; byte < num_bytes; ++byte) {
        auto crc = byte;

        for (auto i = 0; i < num_iterations; ++i) {
            auto mask = -(crc & 1);
            crc = (crc >> 1) ^ (polynomial & mask);
        }

        crc32_table[byte] = crc;
    }

    return crc32_table;
}

Next, we store the table in a global and perform rudimentary static checking on it. This checking could most likely be improved, and it is not necessary to store it in a global.

// Stores CRC-32 table and softly validates it.
static constexpr auto crc32_table = gen_crc32_table();
static_assert(
    crc32_table.size() == 256 &&
    crc32_table[1] == 0x77073096 &&
    crc32_table[255] == 0x2D02EF8D,
    "gen_crc32_table generated unexpected result."
);

Now that the table is generated, it's time to generate the CRC-32 codes. I again based the algorithm off the Hacker's Delight link, and at the moment it only supports input from a c-string.

// Generates CRC-32 code from a null-terminated C string,
// algorithm based on this link:
// http://www.hackersdelight.org/hdcodetxt/crc.c.txt
constexpr auto crc32(const char *in) {
    auto crc = 0xFFFFFFFFu;

    for (auto i = 0u; auto c = in[i]; ++i) {
        crc = crc32_table[(crc ^ c) & 0xFF] ^ (crc >> 8);
    }

    return ~crc;
}

For the sake of completeness, I generate one CRC-32 code below, statically check that it has the expected value, and then print it to the output stream.

int main() {
    constexpr auto crc_code = crc32("some-id");
    static_assert(crc_code == 0x1A61BC96, "crc32 generated unexpected result.");

    std::cout << std::hex << crc_code << std::endl;
}
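Since the function above only accepts null-terminated C strings, a length-aware variant may be useful. The sketch below is one possible shape for it, not part of the code above: it is written in C++11 style (recursion instead of loops, no table) so that it stands alone, and the names `crc_step`, `crc32_n` and the `_sid` literal suffix are assumptions made for this example. The check value for "123456789" is the standard CRC-32 test vector.

```cpp
#include <cstddef>
#include <cstdint>

// Apply k single-bit steps of the reflected CRC-32 (polynomial 0xEDB88320).
constexpr std::uint32_t crc_step(std::uint32_t crc, int k) {
    return k == 0 ? crc
                  : crc_step((crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1),
                             k - 1);
}

// CRC-32 over exactly n bytes, so embedded '\0' characters are hashed too.
constexpr std::uint32_t crc32_n(const char* data, std::size_t n,
                                std::uint32_t crc = 0xFFFFFFFFu) {
    return n == 0 ? ~crc
                  : crc32_n(data + 1, n - 1,
                            crc_step(crc ^ static_cast<std::uint8_t>(*data), 8));
}

// User-defined literal: the compiler supplies the length of the literal.
constexpr std::uint32_t operator""_sid(const char* s, std::size_t n) {
    return crc32_n(s, n);
}

static_assert("123456789"_sid == 0xCBF43926u, "CRC-32 check value mismatch");
static_assert(""_sid == 0u, "empty input should hash to 0");
```

Because the recursion depth grows with the input length, this form is only suitable for short identifiers; the loop-based C++14 version above does not have that limitation.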

Hopefully this helps anyone else looking to generate CRC-32 codes at compile time, or to do compile-time computation in general.

Deus Sum

@tux3's answer is pretty slick! Hard to maintain, though, because you are basically writing your own implementation of CRC32 in preprocessor commands.

Another way to approach your question is to step back and look at the underlying requirement first. If I understand you correctly, the concern is performance. In that case, there is a second point in time at which you can call your function without a runtime performance impact: program load time. You would then read a global variable instead of passing a constant. Performance-wise, after initialization both should be identical (a constant fetches 32 bits from your code, a global variable fetches 32 bits from a regular memory location).

You could do something like this:

#include <boost/crc.hpp>
#include <cstring>

static int myWSID = 0;

// don't call this directly
static int WSID(const char* str) {
  boost::crc_32_type result;
  // strlen, not sizeof: sizeof(str) is only the size of the pointer
  result.process_bytes(str, std::strlen(str));
  return result.checksum();
}

// Put this early into your program into the
// initialization code.
...
myWSID = WSID("some-id");

Depending on your overall program, you may want to have an inline accessor to retrieve the value.

If a minor performance impact is acceptable, you could also write your function like this, basically using the singleton pattern.

// requires <boost/crc.hpp> and <cstring>
// don't call this directly
int WSID(const char* str) {
  boost::crc_32_type result;
  // strlen, not sizeof: sizeof(str) is only the size of the pointer
  result.process_bytes(str, std::strlen(str));
  return result.checksum();
}

// call this instead. Note the hard-coded ID string.
// Create one such function for each ID you need to
// have available.
static int myWSID() {
   // Note: not thread safe!
   static int computedId = 0;
   if (computedId == 0)
      computedId = WSID("some-id");
   return computedId;
}
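If the "not thread safe" caveat matters, one possible variation is to let the language do the one-time initialization: since C++11, a function-local static is initialized exactly once even when several threads reach it concurrently. This is only a sketch under assumptions: to keep it self-contained it swaps boost::crc_32_type for a small hand-written routine, and the names `crc32_cstr` and `some_id_wsid` are made up for this example.

```cpp
#include <cstdint>

// Hand-written runtime CRC-32 (reflected, polynomial 0xEDB88320),
// standing in for boost::crc_32_type so the example has no dependencies.
std::uint32_t crc32_cstr(const char* s) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (; *s; ++s) {
        crc ^= static_cast<unsigned char>(*s);
        for (int k = 0; k < 8; ++k)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

// One accessor per ID, as in the version above. The static is computed on
// the first call only, and C++11 guarantees that initialization is thread safe.
std::uint32_t some_id_wsid() {
    static const std::uint32_t id = crc32_cstr("some-id");
    return id;
}
```

This also avoids using 0 as a "not yet computed" marker, which would cause recomputation on every call for a string whose hash happens to be 0.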

Of course, if the reason for asking for compile-time evaluation is something different (such as, not wanting some-id to appear in the compiled code), these techniques won't help.

The other option is to use Jason Gregory's suggestion of a custom preprocessor. It can be done fairly cleanly if you collect all the IDs into a separate file. This file doesn't need to have C syntax. I'd give it an extension such as .wsid. The custom preprocessor generates a .h file from it.

Here is how this could look:

idcollection.wsid (before custom preprocessor):

some_id1
some_id2
some_id3

Your preprocessor would generate the following idcollection.h:

#define WSID_some_id1 0xabcdef12
#define WSID_some_id2 0xbcdef123
#define WSID_some_id3 0xcdef1234

And in your code, you'd call

Engine.getById(WSID_some_id1);

A few notes about this:

  • This assumes that all the original IDs can be converted into valid identifiers. If they contain special characters, your preprocessor may need to do additional munging.
  • I notice a mismatch in your original question. Your function returns an int, but Engine.getById seems to take a string. My proposed code would always use int (easy to change if you always want a string).
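The generator itself can be quite small. Here is a rough sketch of what it might look like, assuming one ID per line in the .wsid file and CRC-32 as the hash; the function names `crc32_runtime`, `to_identifier` and `generate_header` are invented for this example, and file handling is left out so that the core transformation is easy to see. The munging step simply replaces every non-alphanumeric character with an underscore.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// Runtime CRC-32 (reflected, polynomial 0xEDB88320), computed bit by bit.
std::uint32_t crc32_runtime(const std::string& s) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char c : s) {
        crc ^= c;
        for (int k = 0; k < 8; ++k)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

// Turn an arbitrary ID into something usable in a macro name.
std::string to_identifier(const std::string& id) {
    std::string out;
    for (unsigned char c : id)
        out += std::isalnum(c) ? static_cast<char>(c) : '_';
    return out;
}

// Takes the text of the .wsid file and returns the text of the header.
// The hash is always computed from the original, unmunged ID.
std::string generate_header(const std::string& ids_text) {
    std::istringstream in(ids_text);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // skip blank lines
        char hex[16];
        std::snprintf(hex, sizeof hex, "0x%08x",
                      static_cast<unsigned>(crc32_runtime(line)));
        out << "#define WSID_" << to_identifier(line) << ' ' << hex << '\n';
    }
    return out.str();
}
```

With this sketch, an input line `some-id` would produce a macro named `WSID_some_id`. Two different IDs can munge to the same name (for example `some-id` and `some_id`), so a real tool should detect that and report an error.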
Kevin Keane
  • This should have been a comment. You're pasting my code, which doesn't answer directly the question (it's rather a workaround). But +1 for the second proposal. – Vinz243 Mar 06 '15 at 22:07
  • I probably should have broken it into a comment and a separate answer. Yes, I did use your code as a starting point. The key difference between my version and yours is in running it only once during the initialization. The way I look at requests like yours is, you have an underlying problem to solve, and I took a guess at what the problem might be, and came up with an alternative solution for that. If that doesn't help you - no problem. Thus the second suggestion. Thanks for the upvote! – Kevin Keane Mar 07 '15 at 07:44
  • This might have helped me. But the question was about crc32 at compile time. Hence the comment on the first part. – Vinz243 Mar 07 '15 at 09:29