0

Is there any way to speed up be32enc in C? Here's an example of how I call it for uint32_t values:

for (int i=0; i < 19; i++) {
    be32enc(&endiandata[i], pdata[i]);
}

And the function itself:

static inline void be32enc(void *pp, uint32_t x)
{
 uint8_t *p = (uint8_t *)pp;
 p[3] = x & 0xff;
 p[2] = (x >> 8) & 0xff;
 p[1] = (x >> 16) & 0xff;
 p[0] = (x >> 24) & 0xff;
}

I've searched hard but haven't found anything; this topic doesn't seem to be very popular. The target CPU is an i3-7350K and I use MSVC 2017. MIT- or GPL-licensed libraries are acceptable as well.

YAR
  • Can you rely on the x being little endian? – Yunnosch Jan 10 '18 at 09:27
  • https://godbolt.org/g/hWiiQb – Tatsuyuki Ishi Jan 10 '18 at 09:31
  • I disagree with the duplicate. OP does not seem to intend bit-reversal, only byte-reversal. – Yunnosch Jan 10 '18 at 09:32
  • https://stackoverflow.com/questions/2602823/in-c-c-whats-the-simplest-way-to-reverse-the-order-of-bits-in-a-byte – Tatsuyuki Ishi Jan 10 '18 at 09:34
  • 4
    I would rely on the `htonl` function. It is a utility function for TCP/IP networking that converts a 32-bit integer from host endianness to network (big-endian) order. You could try to benchmark it on your system against your own implementation – Serge Ballesta Jan 10 '18 at 09:49
  • If you need speed you might want to search for *intrinsic functions*. These are usually provided by the compiler and map to a single CPU instruction. I don't know for sure, but I suspect that x86 also has a 32-bit byte-reversal instruction. – user694733 Jan 10 '18 at 10:00
  • 1
    x86 has both an explicit byte-swap and (recently) a store-big-endian – harold Jan 10 '18 at 10:47
  • `this topic is not so popular` because you didn't use the correct keyword. Look for "reverse bytes" or "convert endianness"... and you'll see that it's extremely common – phuclv Jan 10 '18 at 11:31
  • Possible duplicate of [Quickest way to change endianness](https://stackoverflow.com/questions/7279393/quickest-way-to-change-endianness) – phuclv Jan 12 '18 at 06:38
  • [How do I convert between big-endian and little-endian values in C++?](https://stackoverflow.com/q/105252/995714) – phuclv Jan 12 '18 at 06:39

2 Answers

3

There are two modifications that are likely to improve the performance of your be32enc function. First, get rid of the pointer magic and make it a function from uint32_t to uint32_t. Second, if you don't need to be portable to architectures other than x86, implement it using the _bswap intrinsic.
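As a rough sketch of both suggestions, the helper below takes and returns a uint32_t and maps to a compiler intrinsic where one is available. The name swap32 is made up for illustration. Since the question mentions MSVC 2017, the MSVC branch uses _byteswap_ulong from <stdlib.h> rather than _bswap; GCC and Clang get __builtin_bswap32, and any other compiler falls back to plain shifts.

```c
#include <stdint.h>
#if defined(_MSC_VER)
#include <stdlib.h>   /* _byteswap_ulong */
#endif

/* Reverse the byte order of a 32-bit value.
   swap32 is a hypothetical helper name, not part of any library. */
static inline uint32_t swap32(uint32_t x)
{
#if defined(_MSC_VER)
    /* unsigned long is 32 bits wide on MSVC targets */
    return (uint32_t)_byteswap_ulong((unsigned long)x);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#else
    /* Portable fallback: move each byte to its mirrored position */
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
#endif
}
```

On a little-endian host such as x86, `endiandata[i] = swap32(pdata[i]);` should then store the same bytes as the original be32enc call, assuming endiandata is an array of uint32_t. On a big-endian host no swap would be needed at all.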

Johan
  • instead of `_bswap`, use [`__builtin_bswap32`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) for better portability – phuclv Jan 12 '18 at 06:38
1

If you have a decent compiler, you should be able to use builtins (btw there is a BSD standard function that does what you want, htobe32()):

#ifndef I_HAVE_A_CRAP_COMPILER
#define bswap32(x) __builtin_bswap32(x)

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define htobe32(x) bswap32(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define htobe32(x) (x)
#else
#error Must be little or big endian
#endif

#else
/*your implementation here*/
#endif

Edit: if you want to try your C library's own htobe32() function, you can:

#define _BSD_SOURCE
#include <endian.h>

Though the compiler builtin will likely be faster, since it avoids a function call altogether and inlines efficient assembly (a single bswap instruction on x86 and x86_64).
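For completeness, here is a minimal sketch of how the builtin could sit behind the original be32enc interface. The names HOST_TO_BE32, be32enc_fast and be32enc_words are invented for this example, and the MSVC branch assumes a little-endian target, since MSVC does not define __BYTE_ORDER__. The store goes through memcpy so it stays valid for unaligned destinations; optimizing compilers usually turn that into a single store.

```c
#include <stdint.h>
#include <string.h>

/* Pick a host-to-big-endian conversion at compile time.
   HOST_TO_BE32 is a hypothetical macro name. */
#if defined(_MSC_VER)
#include <stdlib.h>
/* Assumption: MSVC targets are little-endian */
#define HOST_TO_BE32(x) ((uint32_t)_byteswap_ulong((unsigned long)(x)))
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define HOST_TO_BE32(x) ((uint32_t)(x))
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define HOST_TO_BE32(x) __builtin_bswap32(x)
#else
#error Unknown byte order: supply your own implementation
#endif

/* Same contract as be32enc: write x to pp in big-endian byte order. */
static inline void be32enc_fast(void *pp, uint32_t x)
{
    uint32_t be = HOST_TO_BE32(x);
    memcpy(pp, &be, sizeof be);   /* safe for unaligned pp */
}

/* Encode n words from src into dst, mirroring the loop in the question. */
static void be32enc_words(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        be32enc_fast(&dst[i], src[i]);
}
```

The loop from the question would then become `be32enc_words(endiandata, pdata, 19);`, assuming both are uint32_t arrays. Whether this is measurably faster than the byte-by-byte version is worth benchmarking, because some compilers already recognize the original shift-and-store pattern.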

technosaurus
  • newer x86 CPUs have [`movbe`](https://stackoverflow.com/q/5246146/995714) which may be better in many cases – phuclv Jan 10 '18 at 15:43
  • @LưuVĩnhPhúc The compiler (at least clang and gcc) will do that automatically for atom and haswell or later using the builtin bswap when compiled with the appropriate "-march=" parameter, but it is not available on all x86_64 architectures. – technosaurus Jan 10 '18 at 15:47