I am debugging a C program and am gravely confused about the lower half of the AddressSanitizer outputs when it finds problems. Let's use this for example:

==33184==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000005 at pc 0x55f312fe2509 bp 0x7ffc99f5f5c0 sp 0x7ffc99f5f5b0
WRITE of size 1 at 0x602000000005 thread T0
    #0 0x55f312fe2508 in main /home/user/c/friends/main.c:20
    #1 0x7fa5ea0e9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #2 0x55f312fe21c9 in _start (/home/user/c/friends/cmake-build-debug/friends+0x11c9)

0x602000000005 is located 11 bytes to the left of 5-byte region [0x602000000010,0x602000000015)
allocated by thread T0 here:
    #0 0x7fa5eb2b8b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x55f312fe23f4 in main /home/user/c/friends/main.c:18
    #2 0x7fa5ea0e9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/c/friends/main.c:20 in main

  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000:[fa]fa 05 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==33184==ABORTING

Everything above this line, I understand: SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/c/friends/main.c:20 in main

My question involves the data presented below that line. I read this answer but it did not answer my question. The memory dump shown by ASAN looks like this:

  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000:[fa]fa 05 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  1. What is the line with the arrow trying to tell me? My assumption is that the 05 which appears between the fas refers to the "5-byte region" in "0x602000000005 is located 11 bytes to the left of 5-byte region." However, I am still confused because the legend says that fa means "heap left redzone," yet it appears both to the right and to the left of the 05. Why are there no "heap right redzones?"

  2. In this example, ASAN says that the program went 11 bytes out of the 5-byte region, yet it shows far more fas than that.

  3. Is there any proper, detailed documentation which actually explains what these terms "heap left redzone", "stack mid redzone", "Global redzone", etc mean? I've not been able to find any.

  4. What is a "Shadow byte/address" in this context?

the_endian
  • have you checked https://en.wikipedia.org/wiki/Shadow_memory ? – bolov May 08 '20 at 07:56
  • Maybe you could write a series of test programs in which you write on purpose something known at a known out of bounds location? After some experiments you could be able to understand what to expect in different out of bounds scenarios.. – Roberto Caboni May 08 '20 at 08:22
  • @RobertoCaboni Even that was difficult because ASAN aborts the program immediately, so I cannot examine the precise memory addresses in gdb after I see the error message. Their docs explaining how to break before exit in gdb are outdated and from 2016. I may be able to disable ASLR, run the same program a few times, and cross-reference the addresses in a non-ASAN version, but they don't make it easy. – the_endian May 08 '20 at 08:28
  • If it aborts, you have a core dump of the complete program state including memory, right? `ulimit` permitting, obviously. – Useless May 10 '20 at 21:59
  • @Useless actually I do not. I keep getting `Coredump entry has no core attached (neither internally in the journal nor externally on disk).` from coredumpctl. This is a separate issue so I'll refrain from any more on that here. – the_endian May 12 '20 at 04:21
  • eww, systemd, bad luck. Normally I'd assume that's either your `ulimit -c` being too small or the target partition filling up, but systemd may have more ways to complicate things. – Useless May 12 '20 at 15:17

1 Answer

What are “shadow bytes” in AddressSanitizer and how should I interpret them?

From the AddressSanitizerAlgorithm page on GitHub (which is also linked from the LLVM AddressSanitizer page):

The virtual address space is divided into 2 disjoint classes:

  • Main application memory (Mem): this memory is used by the regular application code.
  • Shadow memory (Shadow): this memory contains the shadow values (or metadata). There is a correspondence between the shadow and the main application memory. Poisoning a byte in the main memory means writing some special value into the corresponding shadow memory.

So "shadow bytes" are metadata describing the state of your program's addressable memory.

If we look at the asan output:

Shadow byte legend (one shadow byte represents 8 application bytes):

it tells us that the hexdump is of the shadow memory which describes the state of your program's "real" memory. What states does it track?

  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  ...

so if a whole 8-byte group is addressable, the shadow byte that tracks (or shadows) it should have value 00. If it's partly addressable, the shadow byte will be 01..07, which is the number of addressable bytes at the start of the group.
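As a rough sketch of how those shadow values gate an access, the check below follows the rule described on the AddressSanitizerAlgorithm page for accesses of up to 8 bytes. The function name `access_is_ok` is hypothetical and not part of the ASan runtime, and treating every value above 07 as "not addressable" is a simplification for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an ASan API): decide whether an access of
 * `size` bytes at `addr` is allowed, given the shadow value that covers
 * the 8-byte group containing `addr`. The access must not cross an
 * 8-byte boundary, which keeps the sketch to a single shadow byte. */
bool access_is_ok(uint64_t addr, size_t size, uint8_t shadow)
{
    assert(size >= 1 && size <= 8);
    assert((addr & 7) + size <= 8);

    if (shadow == 0x00)
        return true;   /* all 8 bytes of the group are addressable */
    if (shadow > 0x07)
        return false;  /* simplification: fa, fd, f1, ... are all poisoned */

    /* Partially addressable: only the first `shadow` bytes may be touched. */
    return (addr & 7) + size <= shadow;
}
```

With the 05 shadow byte from the report, a 1-byte access at 0x602000000014 passes (it is the fifth byte of the group), while one at 0x602000000015 fails, as does any access under an fa shadow byte.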

The value the hex dump is pointing you to is fa, or "Heap left redzone" - presumably this is some kind of guard region around heap allocations to detect overruns.

From the same link:

The run-time library replaces the malloc and free functions. The memory around malloc-ed regions (red zones) is poisoned

More broadly, this description (in program addresses)

0x602000000005 is located 11 bytes to the left of 5-byte region
  [0x602000000010,0x602000000015)

matches the shadow map shown:

=>0x0c047fff8000:[fa]fa 05 fa ...

Assuming natural alignment,

  • shadow byte 0x0c047fff8000 describes (or, again, shadows) program addresses 0x602000000000..0x602000000007 which includes the address you accessed
  • the next shadow byte at 0x0c047fff8001 describes program addresses 0x602000000008..0x60200000000F
  • both of those have value fa, meaning "Heap left redzone"
  • the next shadow byte at 0x0c047fff8002 describes program addresses 0x602000000010..0x602000000017 and has value 05, meaning 5 bytes are addressable. These are the 5 bytes of your heap allocation.

All of this is consistent with the part of the error you did understand.
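The correspondence in the list above can be reproduced with the mapping given on the AddressSanitizerAlgorithm page for 64-bit targets, Shadow = (Mem >> 3) + 0x7fff8000. The sketch below assumes that x86-64 Linux offset (other platforms use different constants), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants for x86-64 Linux; other targets use other offsets. */
#define SHADOW_SCALE  3
#define SHADOW_OFFSET 0x7fff8000ULL

/* Address of the shadow byte that describes application address `addr`. */
uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* First of the 8 application addresses described by shadow byte `shadow`. */
uint64_t shadow_to_mem_start(uint64_t shadow)
{
    assert(shadow >= SHADOW_OFFSET);
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE;
}
```

Feeding in the faulting address 0x602000000005 gives 0x0c047fff8000, the bracketed entry on the arrow line, and the allocation start 0x602000000010 gives 0x0c047fff8002, the 05 entry.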

  1. However, I am still confused because the legend says that fa means "heap left redzone," yet it appears to the right of the 05 and to the left of it. Why are there no "heap right redzones?"

    I don't know what the directionality really means, here. Heaps typically grow in one direction initially (traditionally up as the stack grows down), but can be fragmented, released, coalesced and re-allocated. Is the gutter between two allocations "right," or "left," or both, or neither? All we need to know is that it's a poisoned heap region that was never allocated to the user.

    Maybe it should just be "Heap redzone", if there is no orientation corresponding to the stack left/mid/right values.

  2. In this example, ASAN says that the program went 11 bytes out of the 5-byte region, yet it shows far more fas than that.

    Each fa represents eight bytes, as the legend says. The allocation starts at 0x602000000010, so if you'd accessed anything from nine to sixteen bytes before it, the access would have shown up in the same shadow byte. If you'd accessed one to eight bytes before, it would have shown up in the next shadow byte (right before the 05).

    The rest of the fas are just a map of the surrounding area, which doesn't appear helpful in this case but might be in others.
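To make the eight-to-one arithmetic concrete, here is a small sketch that counts how many shadow bytes to the left of the allocation's own shadow byte an underflowing access lands. The helper name `shadow_bytes_before` is hypothetical, and it relies only on the "one shadow byte per 8 application bytes" rule from the legend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: for an access `bytes_before` bytes to the left of
 * `alloc_start`, return how many shadow bytes separate the accessed
 * address's shadow byte from the shadow byte of the allocation start.
 * Each shadow byte covers one aligned group of 8 application bytes. */
uint64_t shadow_bytes_before(uint64_t alloc_start, uint64_t bytes_before)
{
    assert(bytes_before <= alloc_start);
    uint64_t accessed = alloc_start - bytes_before;
    return (alloc_start >> 3) - (accessed >> 3);
}
```

For the allocation at 0x602000000010, a distance of 11 gives 2, which is the bracketed fa two entries before the 05. Distances 1 through 8 give 1, and distances 9 through 16 give 2. Each 16-entry row of the dump therefore covers 128 application bytes.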

  3. Is there any proper, detailed documentation which actually explains what these terms "heap left redzone", "stack mid redzone", "Global redzone", etc mean?

    No idea. They seem to follow fairly naturally from the use case though - you hit a red zone = you accessed an address you shouldn't. You can always just read the code, eg. asan_internal.h defines the kAsanHeapLeftRedzoneMagic value, and asan_allocator.cpp poisons shadow bytes with it.

  4. What is a "Shadow byte/address" in this context?

    Just for completeness, a shadow byte is a byte that shadows a group of eight normally-accessible program bytes and tracks some information about them useful to the sanitizer.

    A shadow address is the address of a shadow byte.

Useless
  • This was an excellent answer. I apologize that it took so long to award the bounty - I hadn't realized that I need to take any separate action other than selecting this as an answer. It seems like the system awarded you the bounty now though. :) – the_endian May 20 '20 at 00:12
  • No worries, it isn't a race :) – Useless May 20 '20 at 10:00