Currently I have this function to swap the bytes of a value in order to change its endianness.

template<typename Type, unsigned int Half = sizeof(Type)/2, unsigned int End = sizeof(Type)-1> 
inline void swapBytes(Type& x)
{
    char* c = reinterpret_cast<char*>(&x);
    char tmp;
    for (unsigned int i = 0; i < Half; ++i) {
        tmp = c[i];
        c[i] = c[End-i];
        c[End-i] = tmp;
    }
}

This function will be called by some algorithms of mine several million times. Consequently, every single instruction that can be avoided would be a good thing.

My question is: how can this function be optimized?

Vincent
  • What is wrong with the POSIX versions? htons, ntohs, htonl, ntohl? – Adrian Cornish Oct 03 '12 at 02:07
  • They are not in the standard C++ library... – Vincent Oct 03 '12 at 02:09
  • I did not say they were - is that a requirement you need to meet? If so check the open source versions of these. – Adrian Cornish Oct 03 '12 at 02:11
  • If you want maximum efficiency, you will need to rely on language extensions. For example, if you're byte-swapping a lot of data, you'll need to vectorize. Maximum portability and maximum performance usually conflict. – Mysticial Oct 03 '12 at 02:11
  • Do you really need to use templates and that sort of stuff? If you're really going to call this routine millions of times, and you care about your time, isn't it worth implementing a less versatile function that does the swapping really fast? – Castilho Oct 03 '12 at 02:14
  • And if you truly want to optimize it, do not use C++ but write it in native assembler for the processor you are targeting. – Adrian Cornish Oct 03 '12 at 02:19
  • @AdrianCornish Unless you use some special instructions that the compiler's optimizer does not use, it is hard to outdo a good optimizer by coding in assembly manually. – Sergey Kalinichenko Oct 03 '12 at 02:27
  • @dasblinkenlight Very true - but hand rolled assembler may be better if you have a true handle on the issue. – Adrian Cornish Oct 03 '12 at 02:33
  • @dasblinkenlight: Assuming that the compiler recognizes what you are doing. Some processors have specific instructions (single instruction) that will swap the bytes, that can be a single line of manual assembly or hoping that the compiler will recognize the pattern and substitute the multiple operations by that single instruction. – David Rodríguez - dribeas Oct 03 '12 at 02:50
  • Several solutions including intrinsics are shown here: http://stackoverflow.com/questions/105252/how-do-i-convert-between-big-endian-and-little-endian-values-in-c – Pixelchemist Jul 06 '13 at 18:45

1 Answer

First, check whether your hardware platform has byte-swap instructions; some platforms do and some do not. Then look for a library function that uses them: check the docs, or stop in the debugger and look at the disassembly. There is a good chance that you will find one, and it is unlikely that anything else will work better than this.
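As a rough illustration of the "find a function that maps to the hardware instruction" route, here is a minimal sketch built on compiler builtins. It assumes GCC/Clang (`__builtin_bswap16/32/64`) or MSVC (`_byteswap_ushort/_ulong/_uint64`); the names `byteSwap`, `UIntOfSize` and `swapBytesFast` are hypothetical and not part of the question's code. Whether the builtins actually compile down to a single instruction depends on the target and the compiler, so it is still worth checking the disassembly.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(_MSC_VER)
#include <stdlib.h>  // _byteswap_ushort, _byteswap_ulong, _byteswap_uint64
#endif

// Hypothetical helpers: one overload per width, each forwarding to a builtin.
#if defined(_MSC_VER)
inline std::uint16_t byteSwap(std::uint16_t v) { return _byteswap_ushort(v); }
inline std::uint32_t byteSwap(std::uint32_t v) { return _byteswap_ulong(v); }
inline std::uint64_t byteSwap(std::uint64_t v) { return _byteswap_uint64(v); }
#else  // assumes GCC or Clang
inline std::uint16_t byteSwap(std::uint16_t v) { return __builtin_bswap16(v); }
inline std::uint32_t byteSwap(std::uint32_t v) { return __builtin_bswap32(v); }
inline std::uint64_t byteSwap(std::uint64_t v) { return __builtin_bswap64(v); }
#endif

// Maps a size in bytes to the unsigned integer type of that size.
template <std::size_t N> struct UIntOfSize;
template <> struct UIntOfSize<2> { typedef std::uint16_t type; };
template <> struct UIntOfSize<4> { typedef std::uint32_t type; };
template <> struct UIntOfSize<8> { typedef std::uint64_t type; };

// Same interface as swapBytes, but only for 2-, 4- and 8-byte types.
// memcpy is used to move the bytes in and out of an integer of the same size.
template <typename Type>
inline void swapBytesFast(Type& x)
{
    typename UIntOfSize<sizeof(Type)>::type v;
    std::memcpy(&v, &x, sizeof v);
    v = byteSwap(v);
    std::memcpy(&x, &v, sizeof v);
}
```

Calling `swapBytesFast(value)` on a 4-byte value holding `0x11223344` leaves it holding `0x44332211`. Types of any other size fail to compile, which is a deliberate restriction of this sketch.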

Failing that, write your own function in assembler that uses these instructions.

For a 2-byte type, a straight table lookup will work. The table takes 128 KB (65,536 entries of 2 bytes each), which is not much for today's computers. For 32-bit types this is close to overkill, since a full table would need 2^32 entries of 4 bytes each (16 GB), but in some rare cases it may still work on a big 64-bit box.
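The 2-byte table idea can be sketched roughly as follows. The names `makeSwapTable16` and `swap16ViaTable` are hypothetical, and the table is built lazily on first use, which is just one possible choice. Whether a lookup actually beats a byte-swap or rotate instruction depends heavily on cache behaviour, so this is something to measure rather than assume.

```cpp
#include <cstdint>
#include <vector>

// Builds the full lookup table once: 65,536 entries x 2 bytes = 128 KB.
inline std::vector<std::uint16_t> makeSwapTable16()
{
    std::vector<std::uint16_t> table(65536);
    for (std::uint32_t i = 0; i < 65536; ++i) {
        // Low byte moves to the high position and vice versa.
        table[i] = static_cast<std::uint16_t>(((i & 0xFFu) << 8) | (i >> 8));
    }
    return table;
}

// Swaps the two bytes of v with a single table lookup.
inline std::uint16_t swap16ViaTable(std::uint16_t v)
{
    static const std::vector<std::uint16_t> table = makeSwapTable16();
    return table[v];
}
```

For example, `swap16ViaTable(0x1234)` returns `0x3412`.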

You can also use a combination of asm instructions, table conversion, and an optimized loop.

Kirill Kobelev