17

I want to emulate a system that prohibits unaligned memory accesses on x86/x86_64. Is there a debugging tool or special mode that does this?

I want to run many CPU-intensive tests on several x86/x86_64 PCs while working on software (C/C++) designed for SPARC or a similar CPU, because my access to SPARC hardware is limited.

As far as I know, SPARC always requires memory reads and writes to be naturally aligned (a byte can be read from any address, but a 4-byte word can be read only when the address is divisible by 4).

Maybe Valgrind or PIN has such a mode? Or is there a special compiler mode? I'm looking for a non-commercial Linux tool, but Windows tools are acceptable too.

Or maybe there is a secret CPU flag in EFLAGS?

osgx

4 Answers

13

I've just read the question "Does unaligned memory access always cause bus errors?", which links to the Wikipedia article on Segmentation Fault.

In the article, there's a wonderful reminder of a rather uncommon Intel processor flag: AC, a.k.a. Alignment Check.

And here's how to enable it (from Wikipedia's Bus Error example, with a red-zone clobber bug fixed for x86-64 System V so that it is safe on Linux and macOS, and converted from Basic asm, which is never a good idea inside functions: you want changes to AC to be ordered with respect to memory accesses):

#if defined(__GNUC__)
# if defined(__i386__)
    /* Enable Alignment Checking on x86 */
    __asm__("pushf\n orl $0x40000,(%%esp)\n popf" ::: "memory");
# elif defined(__x86_64__)
    /* Enable Alignment Checking on x86_64 */
    __asm__("add $-128, %%rsp \n"    // skip past the red-zone, in case there is one and the compiler has local vars there.
            "pushf\n"
            "orl $0x40000,(%%rsp)\n"
            "popf \n"
            "sub $-128, %%rsp"       // and restore the stack pointer.
           ::: "memory");       // ordered wrt. other mem access
# endif
#endif

Once enabled, it works much like the ARM alignment settings in /proc/cpu/alignment; see the answer to "How to trap unaligned memory access?" for examples.

Additionally, if you're using GCC, I suggest you enable -Wcast-align warnings. When building for a target with strict alignment requirements (ARM for example), GCC will report locations that might lead to unaligned memory access.
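
To illustrate the kind of code -Wcast-align is aimed at, here is a minimal sketch. The helper names load_u32 and store_u32 are made up for this example; the point is only that going through memcpy avoids the misaligned pointer cast altogether.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an arbitrary byte address.
 * memcpy has no alignment requirement, so this is well-defined even when
 * p is not a multiple of 4. */
static inline uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* The pattern the warning is meant to catch looks like this:
 *     return *(const uint32_t *)p;
 * It is left as a comment because executing it with a misaligned p is
 * undefined behaviour in C and traps on strict-alignment CPUs. */

/* Hypothetical helper: store counterpart of load_u32. */
static inline void store_u32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

GCC's documentation also describes -Wcast-align=strict, which warns about such casts regardless of the target; whether your compiler version supports it is worth checking before relying on it.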

But note that libc's handwritten asm for memcpy and other functions will still make unaligned accesses, so setting AC is often not practical on x86 (including x86-64). GCC will sometimes emit asm that makes unaligned accesses even if your source doesn't, e.g. as an optimization to copy or zero two adjacent array elements or struct members at once.
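
One way to live with those libc and compiler accesses is to keep AC set only around the code under test. The following is a rough sketch under that assumption, not a drop-in tool: ac_supported, ac_set, ac_get and ac_run_checked are hypothetical names, and the asm reuses the pushf/popf sequence shown earlier. It assumes GCC or Clang on x86; on other targets the helpers compile to no-ops.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && defined(__x86_64__)
/* Step over the 128-byte red zone before touching the stack. */
# define AC_HAVE     1
# define AC_PROLOGUE "add $-128, %%rsp\n\t"
# define AC_EPILOGUE "sub $-128, %%rsp"
# define AC_TOP      "(%%rsp)"
#elif defined(__GNUC__) && defined(__i386__)
# define AC_HAVE     1
# define AC_PROLOGUE ""
# define AC_EPILOGUE ""
# define AC_TOP      "(%%esp)"
#else
# define AC_HAVE     0
#endif

/* Nonzero when this build can toggle the AC flag at all. */
static inline int ac_supported(void) { return AC_HAVE; }

/* Set (enable != 0) or clear (enable == 0) EFLAGS.AC, bit 18. */
static inline void ac_set(int enable)
{
#if AC_HAVE
    if (enable)
        __asm__ volatile(AC_PROLOGUE "pushf\n\t"
                         "orl $0x40000, " AC_TOP "\n\t"
                         "popf\n\t" AC_EPILOGUE ::: "memory", "cc");
    else
        __asm__ volatile(AC_PROLOGUE "pushf\n\t"
                         "andl $0xfffbffff, " AC_TOP "\n\t"
                         "popf\n\t" AC_EPILOGUE ::: "memory", "cc");
#else
    (void)enable;
#endif
}

/* Return 1 when EFLAGS.AC is currently set, 0 otherwise. */
static inline int ac_get(void)
{
#if AC_HAVE
    unsigned long flags;
    __asm__ volatile(AC_PROLOGUE "pushf\n\t"
                     "pop %0\n\t" AC_EPILOGUE
                     : "=r"(flags) : : "memory", "cc");
    return (flags & 0x40000UL) != 0;
#else
    return 0;
#endif
}

/* Run fn(arg) with alignment checking on, then switch it off again so
 * that libc calls made by the caller afterwards are not affected. */
static inline void ac_run_checked(void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    ac_set(1);
    fn(arg);
    ac_set(0);
}
```

With AC set, a misaligned access inside fn should raise SIGBUS on Linux, so fn itself has to stay away from libc routines such as memcpy or strcmp for the reasons given above. Unaligned accesses that the compiler generates inside fn can still show up as false positives.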

Peter Cordes
Yann Droneaud
  • 2
    A note for anyone using this on recent Linux: the C library will crash in strcmp, which is used in the dynamic loader. So do `export LD_BIND_NOW=1` before running, so that ld.so will resolve all library symbols at startup instead of on demand. – Zan Lynx Jun 17 '16 at 18:53
  • 1
    There is also 'STAC' instruction "0F 01 CB" - http://www.felixcloutier.com/x86/STAC.html "Sets the AC flag bit in EFLAGS register. This may enable alignment checking of user-mode data accesses." – osgx Nov 10 '16 at 16:06
  • 1
    @osgx: `stac` was new with the SMAP (Supervisor Mode Access Prevention) feature (Broadwell?), and is illegal at privilege level > 0. i.e. it faults in user-space. User-space has to continue using pushf/popf to set AC for itself. IDK why they decided not to let stac/clac decode in user-space since it's something user-space can do using the stack. – Peter Cordes Apr 29 '20 at 09:55
  • 1
    Warning: Your code to enable it on x86_64 will clobber the red zone under gcc's nose. – Joseph Sible-Reinstate Monica Jun 28 '20 at 16:25
  • 1
    @JosephSible-ReinstateMonica: good point, I fixed that here. The code on wikipedia is now on the Bus Error article; I might get around to updating that page, too. – Peter Cordes Dec 21 '20 at 00:48
8

It's tricky and I haven't done it personally, but I think you can do it in the following way:

x86_64 CPUs (I've specifically checked the Intel Core i7, but I guess others as well) have a performance counter event, MISALIGN_MEM_REF, which counts misaligned memory references.

So first of all, you can run your program under the Linux "perf" tool to get a count of the misaligned accesses your code has performed.

A trickier and more interesting hack would be to write a kernel module that programs the performance counter to generate an interrupt on overflow, and arranges for it to overflow on the first unaligned load/store. The module would respond to this interrupt by sending a signal to your process.

This will, in effect, turn the x86_64 into a core that doesn't support unaligned access.

This won't be simple, though: besides your code, the system libraries also use unaligned accesses, so it will be tricky to separate them from your own code.

gby
  • 2
    "kernel module that programs the performance counter to generate an interrupt" - isn't it a mode of perf/oprofile when we doing profiling? (`perf record -e MISALIGN_MEM_REF:u -c 1`.) And perf already has code to separate libraries and user code. The interrupt from perf will not stop the program; but perf will record where unaligned access was. I think this mode can be more helpful then killing program and do one-by-one fixes. – osgx Aug 07 '12 at 09:34
  • 2
    @osgx you are correct. If generating an exception-like interrupt in the same way that would happen on a CPU that does not support unaligned load/store is not important, "perf record -e MISALIGN_MEM_REF:u -c 1" can be used to find every location in the program that does them, I agree. – gby Aug 07 '12 at 11:19
  • @osgx, for what version of perf does your above command work? I have to use `-e alignment-faults` on my perf_3.13 (Ubuntu 14.04) but it never records any actual faults for my test code with explicit faults in it. – Nathan Kidd Sep 18 '14 at 20:40
  • Nathan Kidd, don't use high-level event "alignment-faults" of perf (it is not mapped to anything on x86), find a raw hardware perf event of your CPU. Not every Intel CPU has the event MISALIGN_MEM_REF. – osgx Sep 18 '14 at 21:11
6

Both GCC and Clang have UndefinedBehaviorSanitizer built in. One of its checks, alignment, can be enabled with -fsanitize=alignment. The compiler then emits code that checks pointer alignment at run time and reports an error when a misaligned pointer is dereferenced; add -fno-sanitize-recover=alignment if the program should abort at the first report.
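
As a rough picture of what such a run-time check amounts to, here is a hand-written sketch. It is not the code the sanitizer generates; is_aligned and checked_load_int are hypothetical helpers that apply the same test (address modulo the type's alignment) before touching memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero when p is a multiple of align.
 * align must be a power of two. */
static inline int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: load an int only from a suitably aligned address.
 * Returns 0 and fills *out on success, or -1 when the address is
 * misaligned, which is the case the sanitizer would report. */
static inline int checked_load_int(const void *p, int *out)
{
    if (!is_aligned(p, _Alignof(int)))
        return -1;
    memcpy(out, p, sizeof *out);  /* address is aligned at this point */
    return 0;
}
```

A check like this only sees accesses written in the C source, so it does not trip over unaligned accesses inside libc or ones the optimizer introduces on purpose.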

See online documentation at:

malat
lights0123
  • 2
    Nice, that should catch C source-level misaligned pointers without tripping over potentially-unaligned accesses that compilers generate on purpose when optimizing narrow aligned accesses for a platform with fast known-safe unaligned access (like x86). Also, memcpy and other libc functions use unaligned accesses in hand-written asm, (e.g. for small non-power-of-2 sized copies in glibc). So enabling x86's AC flag generally isn't usable – Peter Cordes Dec 21 '20 at 00:53
0

Perhaps you could somehow compile to SSE with all moves aligned. Unaligned accesses with movaps are illegal and would probably behave like illegal unaligned accesses on other architectures.
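
A small sketch of the difference between the aligned and unaligned SSE loads, using the standard intrinsics from xmmintrin.h. The function name sum4 is made up for this example, and a scalar fallback is included for builds without SSE. Note that the aligned load requires 16-byte alignment, which is stricter than the 4-byte natural alignment of a float.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__)
# include <xmmintrin.h>
#endif

/* Hypothetical helper: add up the four floats starting at p. */
static inline float sum4(const float *p)
{
    assert(p != NULL);
#if defined(__SSE__)
    /* _mm_load_ps (typically movaps) faults unless p is 16-byte aligned;
     * _mm_loadu_ps (typically movups) accepts any address. */
    __m128 v = ((uintptr_t)p % 16 == 0) ? _mm_load_ps(p) : _mm_loadu_ps(p);
    float lanes[4];
    _mm_storeu_ps(lanes, v);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#else
    return p[0] + p[1] + p[2] + p[3];
#endif
}
```

Calling sum4 on element 1 of a 16-byte-aligned float array takes the unaligned path even though that address is perfectly aligned for a float, which is why movaps faults are not a good stand-in for natural-alignment checking.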

Jens Björnhager
  • not every operation in my code is vectorizable, I think. And task is to find all unaligned accesses. – osgx Aug 07 '12 at 00:18
  • 1
    You don't need to vectorize code to use SSE, it can do scalar arithmetic. – Jens Björnhager Aug 07 '12 at 01:43
  • @JensBjörnhager. Yes, but only 16-byte loads and stores have alignment-required versions like `movaps` and `movdqa`. Narrow instructions like `movss` (scalar single), `movsd` (scalar double) and `movd`/`movq` are just like regular GP-integer `mov`, not requiring any alignment. (Unless you enable the AC flag.) Of course, if GCC *knows* a pointer may not be aligned by 16, it will auto-vectorize with `movups` instead. Even if it's known to be aligned by 4 or 8. – Peter Cordes Dec 21 '20 at 00:56