Update: @chtz's hack totally works. It tricks the compiler into not realizing it's building an int from a char array.
Solution 1/4: use a macro to hack together a uint32_t by manually calculating it from 4 bytes
Update 2: consider endianness. x86-64 systems are little-endian. I originally used the big-endian byte ordering by mistake:
// For big-endian byte ordering
uint32_t num = ((chars[0]*256 + chars[1])*256 + chars[2])*256 + chars[3];
// Update 2: reverse the order for correct endianness:
// For little-endian byte ordering
uint32_t num = ((chars[3]*256 + chars[2])*256 + chars[1])*256 + chars[0];
test.cpp:
///usr/bin/env ccache g++ -Wall -Wextra -Werror -O3 -std=gnu++17 "$0" -o /tmp/a && /tmp/a "$@"; exit
// For the line just above, see my answer here: https://stackoverflow.com/a/75491834/4561887
#include <cstdio>  // printf
#include <iostream>
// Note: this macro packs the 4 chars in big-endian order (first char in the
// highest byte), which is why "FOOD" becomes 1179602756 below.
#define HASH4(s) ((((s)[0]*256+(s)[1])*256+(s)[2])*256+(s)[3])
void check_int(int i)
{
    switch(i)
    {
    case HASH4("FOOD"):
        printf("FOOD\n");
        break;
    case HASH4("TREE"):
        printf("TREE\n");
        break;
    }
}
int main()
{
    std::cout << "Test\n";
    int something = HASH4("FOOD");
    printf("something = %i\n", something); // something = 1179602756
    check_int(something);
    something = 1179602756;
    check_int(something);
    // ----------------------------
    // withOUT using a #define now
    // ----------------------------
    something = ((('F'*256+'O')*256+'O')*256+'D');
    switch(something)
    {
    case ((('F'*256+'O')*256+'O')*256+'D'):
        printf("FOOD\n");
        break;
    }
    return 0;
}
Run cmd:
chmod +x test.cpp # make executable
./test.cpp # run it
Output:
Test
something = 1179602756
FOOD
FOOD
FOOD
Without using a #define: this works fine because ((('F'*256+'O')*256+'O')*256+'D') is a constant expression! It is totally calculated into a constant value at compile-time.
Solution 2/4 (better): use a constant expression function hack instead of the macro hack above
@Pepijn Kramer is right that constexpr functions can be used to replace macros which just do compile-time calculations. In other words, constexpr functions can replace some macros. constexpr functions may be preferred because they have type safety and checking, and they avoid the double-evaluation problem that macros have when you pass an expression or assignment into them.
constexpr functions produce constant expressions at compile-time when their arguments are themselves constant expressions (for example, in a case label), and they run as regular functions at runtime otherwise. So, they are like a mix of the functionality of some macros + regular functions.
Here's one solution, passing a std::array of 4 chars into a constexpr function:
///usr/bin/env ccache g++ -Wall -Wextra -Werror -O3 -std=gnu++17 "$0" -o /tmp/a && /tmp/a "$@"; exit
// For the line just above, see my answer here: https://stackoverflow.com/a/75491834/4561887
#include <array>
#include <cstdint>  // uint32_t
#include <cstdio>   // printf
#include <iostream>
constexpr uint32_t hash4chars(const std::array<char, 4>& chars)
{
    // For big-endian byte ordering
    // uint32_t num = ((chars[0]*256 + chars[1])*256 + chars[2])*256 + chars[3];
    // Update: reverse the order for correct endianness:
    // For little-endian byte ordering
    uint32_t num = ((chars[3]*256 + chars[2])*256 + chars[1])*256 + chars[0];
    return num;
}
void check_int(int i)
{
    switch(i)
    {
    case hash4chars({'F', 'O', 'O', 'D'}):
        printf("FOOD\n");
        break;
    case hash4chars({'T', 'R', 'E', 'E'}):
        printf("TREE\n");
        break;
    }
}
int main()
{
    std::cout << "Test\n";
    uint32_t num = hash4chars({'F', 'O', 'O', 'D'});
    printf("num = %u\n", num);
    check_int(num);
    // convert the num back to a char array to check that it was converted
    // correctly
    const char* str = (const char*)(&num);
    printf("%c%c%c%c\n", str[0], str[1], str[2], str[3]);
    return 0;
}
Run and output, showing that the 4 bytes in FOOD turn into the uint32_t number of 1146048326, and that number turns back into the 4 chars FOOD on my x86-64 Linux system (which is little endian):
$ ./test.cpp
Test
num = 1146048326
FOOD
FOOD
Solution 3/4: (best so far) constexpr function hack using a std::string_view as input, instead of the std::array just above
Even better still, use a std::string_view as the input parameter so you can still pass a raw C-string to it. Here is a full example:
///usr/bin/env ccache g++ -Wall -Wextra -Werror -O3 -std=gnu++17 "$0" -o /tmp/a && /tmp/a "$@"; exit
// For the line just above, see my answer here: https://stackoverflow.com/a/75491834/4561887
#include <cstdint>
#include <cstdio>  // printf
#include <iostream>
#include <string_view>
constexpr uint32_t hash4chars(const std::string_view& sv)
{
    // Error checking: ensure all inputs have only 4 chars.
    // Note: as really crude error checking, we'll just return the sentinel
    // value of `UINT32_MAX` if this error occurs. Better techniques exist.
    if (sv.size() != 4)
    {
        printf("Error: the string view should be 4 chars long!\n");
        return UINT32_MAX;
    }
    // static_assert(sv.size() == 4); // doesn't work
    // For big-endian byte ordering
    // uint32_t num = ((sv[0]*256 + sv[1])*256 + sv[2])*256 + sv[3];
    // Update: reverse the order for correct endianness:
    // For little-endian byte ordering
    uint32_t num = ((sv[3]*256 + sv[2])*256 + sv[1])*256 + sv[0];
    return num;
}
void check_int(int i)
{
    switch(i)
    {
    case hash4chars("FOOD"):
        printf("FOOD\n");
        break;
    case hash4chars("TREE"):
        printf("TREE\n");
        break;
    }
}
int main()
{
    std::cout << "Test\n";
    uint32_t num = hash4chars("FOOD");
    printf("num = %u\n", num);
    check_int(num);
    // convert the num back to a char array to check that it was converted
    // correctly
    const char* str = (const char*)(&num);
    printf("%c%c%c%c\n", str[0], str[1], str[2], str[3]);
    return 0;
}
Run and output (exact same as previously):
$ ./test.cpp
Test
num = 1146048326
FOOD
FOOD
Solution 4/4: don't convert 4 bytes to integers; just hash the string directly, as a string view, using built-in C++ hash functions
Based on the fact you are calling your macros HASH4() and HASH8() in the question, it seems you really just want a unique or near-unique hash of the input string. I.e., you don't actually need the integer whose bytes match the string; rather, you just need a hash of it.
In that case, you can also just use C++'s built-in std::hash<>{}() functor. See here:
- https://en.cppreference.com/w/cpp/utility/hash - general documentation
- https://en.cppreference.com/w/cpp/string/basic_string_view/hash - documentation on the std::string_view specialization of it
- https://en.cppreference.com/w/cpp/string/basic_string_view/operator%22%22sv - meaning of the operator""sv() function, used as "my_string"sv to produce a std::string_view from the C-string "my_string" in the examples just above
But, std::hash<>{}() is not a constexpr function, so you cannot use it in switch cases either! Rather, you must use the if-else style of checking.
How to read std::hash<>{}():
- std is the namespace
- hash is the class template
- <> specifies the template type
- {} constructs a default object of this class type
- () calls the operator() (the parentheses, function-like [or "functor"] operator) on this object, which in this case is the function that performs the hash on the parameters inside those parentheses.
Note: the following code works great, and may be the most beloved by many C++ people, but I find it pretty complicated and perhaps too "C++"-y. Your call. It also isn't a constant expression. I'm happy I have finally reached the point after 3 years of daily C++ usage that I can even read and write this myself, however, and having access to a quick hash of C-strings (interpreted as std::string_views) is in fact nice to have as part of the C++ language.
///usr/bin/env ccache g++ -Wall -Wextra -Werror -O3 -std=gnu++17 "$0" -o /tmp/a && /tmp/a "$@"; exit
// For the line just above, see my answer here: https://stackoverflow.com/a/75491834/4561887
#include <cstdio>  // printf
#include <iostream>
#include <string_view>
void check_hash(std::size_t hash)
{
    if (hash == std::hash<std::string_view>{}(std::string_view{"FOOD", 4}))
    {
        printf("FOOD\n");
    }
    else if (hash == std::hash<std::string_view>{}(std::string_view{"TREE", 4}))
    {
        printf("TREE\n");
    }
}
int main()
{
    std::cout << "Test\n";
    std::size_t num
        = std::hash<std::string_view>{}(std::string_view{"FOOD", 4});
    printf("num = %zu\n", num);  // %zu is the portable specifier for size_t
    check_hash(num);
    return 0;
}
Run and output:
$ ./test.cpp
Test
num = 16736621008042147638
FOOD
That std::hash functor is pretty ugly, so, if you like, you can beautify it a bit by wrapping it with a macro:
#define HASH(string, num_chars) \
std::hash<std::string_view>{}(std::string_view{(string), (num_chars)})
Example:
#include <cstdio>  // printf
#include <iostream>
#include <string_view>
#define HASH(string, num_chars) \
    std::hash<std::string_view>{}(std::string_view{(string), (num_chars)})
void check_hash(std::size_t hash)
{
    if (hash == HASH("FOOD", 4))
    {
        printf("FOOD\n");
    }
    else if (hash == HASH("TREE", 4))
    {
        printf("TREE\n");
    }
}
int main()
{
    std::cout << "Test\n";
    std::size_t num = HASH("FOOD", 4);
    printf("num = %zu\n", num);  // %zu is the portable specifier for size_t
    check_hash(num);
    return 0;
}
The output is the same as just above.
Going further
If you want to look more into conversions of memory blobs to and from byte arrays, see also my other answers here:
- How to convert a struct variable to uint8_t array in C:
  - Answer 1/3: use a union and a packed struct
  - Answer 2/3: convert a struct to an array of bytes via manual bit-shifting
  - Answer 3/3: use a packed struct and a raw uint8_t pointer to it
Other info to consider and understand
To make 4 bytes get interpreted as a constant 4-byte int (const int32_t), simply use
// this
#define CONST_INT32(bytes) (*((const int32_t*)(bytes)))
// instead of this
#define CONST_INT32(bytes) (*((int32_t*)(bytes)))
ie: add const before your pointer cast.
But, that gets you a const int32_t, which is not the same as a constexpr int32_t (a constant expression int32_t). A constant expression is a value the compiler can fully compute at compile-time, and the rules for constant expressions do not allow a reinterpret_cast or the equivalent C-style pointer cast. Reinterpret-casting 4 bytes into an int via a macro therefore can never produce one.
So, no, in C++ there is no preprocessor macro way I am aware of to forcefully interpret 4 bytes as a constexpr int.
You can reinterpret 4 bytes as a const int instead, but that's not the same thing. Only constant expressions can be used as case labels in a switch statement, so @dbush's answer is right. Use an if-else to check the const int values instead.
Note: a const int that is initialized with a constant expression (such as a literal) is itself usable in constant expressions, just like a constexpr int. So, this runs:
#include <cstdio>  // printf
#include <iostream>
int main()
{
    std::cout << "Test\n";
    const int CASE1 = 7; // compiler sees these could also be constexpr
    const int CASE2 = 8; // compiler sees these could also be constexpr
    int something = CASE1;
    switch(something)
    {
    case CASE1:
        printf("CASE1\n");
        break;
    case CASE2:
        printf("CASE2\n");
        break;
    }
    return 0;
}
...as well as this:
#include <cstdio>  // printf
#include <iostream>
int main()
{
    std::cout << "Test\n";
    constexpr int CASE1 = 7; // you are explicitly making these constexpr
    constexpr int CASE2 = 8; // you are explicitly making these constexpr
    int something = CASE1;
    switch(something)
    {
    case CASE1:
        printf("CASE1\n");
        break;
    case CASE2:
        printf("CASE2\n");
        break;
    }
    return 0;
}