3

From what I have read, misaligned access means mostly two things:

  1. you may get a performance loss
  2. you will lose atomicity of loads and stores that aligned access has

Supposing that performance is not an issue and what I want from software is correctness, how bad is misaligned access? My understanding is that the x86 CPU will handle such accesses correctly but may have to do additional work to fetch the data.

What led me to ask this question was compiling my code with -fsanitize=undefined. I got many errors about misaligned stores/loads. I am not sure if this is an issue because:

  1. the stores are performed only during data preparation which is a single-threaded process, so I am not concerned about loss of atomicity
  2. the loads are performed in a multithreaded process where many threads (four or more) access the data, but the data is not modified by any of them (held in a const uint8_t* variable)

The reason accesses are not aligned is that the const uint8_t* array contains bytes from many different types (uint8_t, uint16_t, uint32_t, uint64_t, and int64_t).

I am sure that no load goes outside the bounds of the allocated uint8_t array (e.g. the program never loads a uint64_t from an address that points to the last one, two, or three bytes of the allocated memory block), and I am sure that my accesses are all correct, only misaligned.

Another thing I read is that such loads may be breaking the strict aliasing rules, but the code compiles without a single warning with -Wstrict-aliasing -Werror (which I have long ago enabled).

Should I pad the data in the uint8_t array to ensure accesses are aligned, or may I safely ignore the warnings?

Maelkum
  • 3
    The data in the array can only be accessed by std::memcpy (into/from the target data type) if you wish to avoid undefined behaviour. Direct access by pointers (other than `uint8_t*` and `char*`) is UB. – Richard Critten Jun 25 '17 at 11:39
  • 4
    *"My understanding is that the CPU will handle such accesses correctly but may have to do additional work to fetch the data."* **Which CPU?** x86 is fine with unaligned accesses, but on other CPUs, they are fatal errors. On yet other CPUs, it is user-selectable whether misalignment is a fatal error or whether it will be fixed up at a massive performance cost. – Cody Gray - on strike Jun 25 '17 at 11:53
  • @CodyGray thanks, I clarified the question. – Maelkum Jun 25 '17 at 11:56
  • @RichardCritten Thanks, if you'd write your comment as an answer I will accept it. Using std::memcpy() fixes warnings raised by the sanitizer. – Maelkum Jun 25 '17 at 11:58

2 Answers

3

There are platforms that don't support unaligned access at all (you will get a crash). There are also platforms where unaligned access is supported in general, but some instructions still require an aligned address. For example, ARM has the LDRD instruction, which needs an aligned memory address, and unfortunately the compiler is free to use it. Usually, though, there is a compiler extension that tells the compiler a pointer may be unaligned, so it won't use LDRD.

On platforms that support unaligned access, there are the penalties you mentioned.

I recommend using memcpy. It works on all platforms, and compilers are pretty good nowadays to optimize it (so you won't get memcpy calls, but fast mov instructions).
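A minimal sketch of what that could look like. The helper names `load_unaligned` and `store_unaligned` are made up for this example:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: read a T from any byte address, aligned or not.
template <typename T>
T load_unaligned(const std::uint8_t* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    T value;
    std::memcpy(&value, src, sizeof(T));  // copies bytes, no alignment assumption
    return value;
}

// Hypothetical helper: write a T to any byte address, aligned or not.
template <typename T>
void store_unaligned(std::uint8_t* dst, T value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::memcpy(dst, &value, sizeof(T));
}
```

A call such as `load_unaligned<std::uint32_t>(data + 1)` then replaces the `reinterpret_cast` and dereference. The caller still has to guarantee that `sizeof(T)` bytes are available at that address.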

mloskot
geza
  • 1
    *"compilers are pretty good nowadays to optimize it (so you won't get memcpy calls, but fast mov instructions)"* Well, some compilers. MSVC doesn't do this optimization. But then, it doesn't have `-fsanitize=undefined`, and it doesn't take every opportunity afforded it by the standard to optimize in a way that might break code (in other words, its optimizer is [well distinguished from an adversary](https://blog.regehr.org/archives/970)). – Cody Gray - on strike Jun 25 '17 at 12:04
  • @Cody: which version doesn't do this? I've checked VS2015 (compiler version 19.00.24213.1), and it optimized it – geza Jun 25 '17 at 12:08
  • Then basically every version prior to VS 2015! There were some big changes made to the optimizer recently, though, so this may be among them. Nice to see that. But I remain skeptical of how good it is at detecting the optimization possibility. The test code you wrote was probably very simple. I've disassembled real code and found it *not* applying the optimization. – Cody Gray - on strike Jun 25 '17 at 12:11
  • @Cody: yep, it was simple. Why are you skeptical? As I understand, detecting and applying this optimization is easy – geza Jun 25 '17 at 12:14
1

The main problem isn't performance or atomicity, it is correctness. Misaligned accesses invoke undefined behavior according to the C and C++ standards, hence you can't rely on any particular outcome. It may work, or it may crash. Or it may work first, and stop working sometime later. This is the essence of the error messages you get. You may choose to ignore the errors if you know that it will always work for you, but since you asked for such errors to be flagged, by using the corresponding compiler switch, it is only reasonable that you should strive to avoid them, especially if you are not absolutely sure that your code will stay on this platform forever. Furthermore, how do you know it'll always work for you even on the same platform?

From what you write, it seems that the data is written by the same machine that reads it later, only the threads are different. If so, you should attempt to write the data in a properly aligned way, i.e. use padding where appropriate. You may be able to get help from the compiler by packing the data into a properly defined struct rather than an unstructured buffer. This will also give you more type safety.
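If the buffer has to stay unstructured, the padding can be computed by hand. This is only a sketch: `align_up` and `append_aligned` are names invented here, and the offsets are aligned relative to the start of the buffer, so it assumes the buffer's base address is itself aligned at least as strictly as the widest type stored.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Round `offset` up to the next multiple of `alignment`,
// which must be a power of two.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical writer: pad with zero bytes so the value starts at an offset
// that is a multiple of alignof(T), copy it in, and return that offset.
template <typename T>
std::size_t append_aligned(std::vector<std::uint8_t>& buffer, const T& value) {
    const std::size_t offset = align_up(buffer.size(), alignof(T));
    buffer.resize(offset + sizeof(T));  // new bytes are zero-initialized
    std::memcpy(buffer.data() + offset, &value, sizeof(T));
    return offset;
}
```

Padding only addresses the alignment complaint. Reading the values back through a cast pointer can still run into the aliasing rules, so using memcpy for the loads remains the cautious choice.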

Otherwise you would have to worry about more than just alignment. For example, you would also need to take endianness into account. In this case you are probably writing a kind of external data record that might end up on a different machine. You are looking for a machine-neutral external data representation, which you can define yourself or, better, take from one of the several standard representations that have been invented for RPC. The latter has the advantage that you can find libraries to do the reading and writing.
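If you do define the representation yourself, one common approach is to fix the byte order and assemble values byte by byte, which also sidesteps alignment. A sketch for 32-bit values, assuming little-endian is the chosen order (the function names are illustrative):

```cpp
#include <cstdint>

// Write `value` as four bytes in little-endian order, whatever the host's
// byte order. Works at any address because it only touches single bytes.
inline void store_le32(std::uint8_t* dst, std::uint32_t value) {
    dst[0] = static_cast<std::uint8_t>(value);
    dst[1] = static_cast<std::uint8_t>(value >> 8);
    dst[2] = static_cast<std::uint8_t>(value >> 16);
    dst[3] = static_cast<std::uint8_t>(value >> 24);
}

// Read four little-endian bytes back into a uint32_t.
inline std::uint32_t load_le32(const std::uint8_t* src) {
    return static_cast<std::uint32_t>(src[0])
         | (static_cast<std::uint32_t>(src[1]) << 8)
         | (static_cast<std::uint32_t>(src[2]) << 16)
         | (static_cast<std::uint32_t>(src[3]) << 24);
}
```

The same pattern extends to 16-bit and 64-bit values by changing the number of bytes and shifts.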

sh-
  • I am aware of the endianness issue. I don't think I can get help from the compiler, given that I kinda-sorta need the data to be an unstructured buffer. – Maelkum Jun 25 '17 at 13:17
  • Then you may want to have a look at [std::align](http://en.cppreference.com/w/cpp/memory/align) and [alignof](http://en.cppreference.com/w/cpp/language/alignof). – sh- Jun 25 '17 at 13:38
  • 1
    The Standard recognizes the legitimacy of non-portable programs, and the authors recognize the language's usefulness in many fields stems from implementations' support of features not mandated by the Standard, including the ability to behave as a "high-level assembler" for various platforms. Code which specifies that it is intended for use only on compilers that seek to be usable in such fashion may not be portable to compilers that aren't suitable for such purpose, but that hardly makes it "incorrect". – supercat Jun 22 '18 at 15:00