Is there any way to speed up be32 encoding in C?

Question

Is there any way to speed up be32enc in C? Here's an example of what I do for uint32_t:

for (int i=0; i < 19; i++) {
    be32enc(&endiandata[i], pdata[i]);
}

And the function itself:

static inline void be32enc(void *pp, uint32_t x)
{
    uint8_t *p = (uint8_t *)pp;
    p[3] = x & 0xff;
    p[2] = (x >> 8) & 0xff;
    p[1] = (x >> 16) & 0xff;
    p[0] = (x >> 24) & 0xff;
}

I've googled hard but haven't found anything; this topic is not so popular. The target CPU is an i3-7350K and I use MSVC 2017. MIT/GPL-licensed libraries are fine as well.

I disagree with the duplicate. OP does not seem to intend bit-reversal, only byte-reversal. — Yunnosch, Jan 10 '18 at 09:32
https://stackoverflow.com/questions/2602823/in-c-c-whats-the-simplest-way-to-reverse-the-order-of-bits-in-a-byte — Tatsuyuki Ishi, Jan 10 '18 at 09:34
I would rely on the `htonl` function. It is a utility function for TCP/IP networking that converts a 32-bit integer from host endianness to network (big-endian) order. You could try to benchmark it on your system against your own implementation. — Serge Ballesta, Jan 10 '18 at 09:49
If you need speed you might want to search for *intrinsic functions*. These are usually provided by the compiler and map to a single CPU instruction. I don't know for sure, but I suspect that x86 also has a 32-bit byte-reversal instruction. — user694733, Jan 10 '18 at 10:00
x86 has both an explicit byte-swap and (recently) a store-big-endian — harold, Jan 10 '18 at 10:47
`this topic is not so popular` because you didn't use the correct keyword. Look for "reverse bytes" or "convert endianness"... and you'll see that it's extremely common — phuclv, Jan 10 '18 at 11:31
Possible duplicate of [Quickest way to change endianness](https://stackoverflow.com/questions/7279393/quickest-way-to-change-endianness) — phuclv, Jan 12 '18 at 06:38
[How do I convert between big-endian and little-endian values in C++?](https://stackoverflow.com/q/105252/995714) — phuclv, Jan 12 '18 at 06:39

score 3 · Answer 1 · answered Jan 10 '18 at 10:20

Two modifications are likely to improve the performance of your be32enc function. First, get rid of the pointer magic and make it a function from uint32_t to uint32_t. Second, if you don't need to be portable to architectures other than x86, implement it using the _bswap intrinsic.
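A minimal sketch of both suggestions, assuming a little-endian host such as x86. The helper names `bswap32_value` and `be32enc_fast` are hypothetical and not part of the original answer. `_byteswap_ulong` is the byte-swap routine MSVC declares in `<stdlib.h>`, and `__builtin_bswap32` is the GCC/Clang builtin; which instruction the compiler actually emits depends on the compiler and flags.

```c
#include <stdint.h>
#include <string.h>
#if defined(_MSC_VER)
#include <stdlib.h>   /* _byteswap_ulong */
#endif

/* Hypothetical helper: reverse the four bytes of x, value in, value out. */
static inline uint32_t bswap32_value(uint32_t x)
{
#if defined(_MSC_VER)
    return _byteswap_ulong(x);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#else
    /* Plain-C fallback for compilers without a byte-swap intrinsic. */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}

/* Assumption: little-endian host, so swapping yields big-endian byte order.
   memcpy keeps the store valid even when pp is not 4-byte aligned. */
static inline void be32enc_fast(void *pp, uint32_t x)
{
    uint32_t be = bswap32_value(x);
    memcpy(pp, &be, sizeof be);
}
```

On a little-endian machine `be32enc_fast` should produce the same bytes as the original `be32enc`, so the two can be benchmarked against each other directly.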

Johan


instead of `_bswap`, use [`__builtin_bswap32`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) for better portability – phuclv Jan 12 '18 at 06:38

score 1 · Answer 2 · answered Jan 10 '18 at 10:37

If you have a decent compiler, you should be able to use builtins (btw there is a BSD standard function that does what you want, htobe32()):

#ifndef I_HAVE_A_CRAP_COMPILER
#define bswap32(x) __builtin_bswap32(x)

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define htobe32(x) bswap32(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define htobe32(x) (x)
#else
#error Must be little or big endian
#endif

#else
/*your implementation here*/
#endif

Edit: if you want to try your C library's builtin htobe32() function you can:

#define _BSD_SOURCE
#include <endian.h>

The compiler builtin will likely be faster, though, since it avoids a function call altogether and inlines efficient assembly (a single bswap instruction on x86 and x86_64).
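As a rough illustration of the builtin approach applied to a whole buffer, here is a sketch assuming GCC or Clang (it relies on `__builtin_bswap32` and the predefined `__BYTE_ORDER__` macros, so it is not expected to compile with MSVC as written). The names `host_to_be32` and `encode_words_be` are hypothetical; the first is deliberately not called `htobe32` to avoid clashing with `<endian.h>`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* An undefined macro evaluates to 0 in #if, so check explicitly first. */
#if !defined(__BYTE_ORDER__)
#error "this sketch assumes GCC/Clang-style __BYTE_ORDER__ macros"
#endif

/* Hypothetical helper: convert a host-order value to big-endian order. */
static inline uint32_t host_to_be32(uint32_t x)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return __builtin_bswap32(x);
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return x;
#else
#error "Must be little or big endian"
#endif
}

/* Hypothetical helper: write count words from src into dst as big-endian
   bytes. dst must have room for 4 * count bytes. */
static void encode_words_be(uint8_t *dst, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t be = host_to_be32(src[i]);
        memcpy(dst + 4 * i, &be, sizeof be);  /* alignment-safe store */
    }
}
```

For the loop in the question this would be called with a count of 19, for example `encode_words_be((uint8_t *)endiandata, pdata, 19)`, assuming `endiandata` holds at least 19 32-bit slots.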

edited Jan 10 '18 at 10:57

answered Jan 10 '18 at 10:37

technosaurus


newer x86 CPUs have [`movbe`](https://stackoverflow.com/q/5246146/995714) which may be better in many cases – phuclv Jan 10 '18 at 15:43
@LưuVĩnhPhúc The compiler (at least clang and gcc) will do that automatically for Atom and Haswell or later using the builtin bswap when compiled with the appropriate "-march=" parameter, but it is not available on all x86_64 architectures. – technosaurus Jan 10 '18 at 15:47
