I wrote a very simple program, out of curiosity, to read the 1000 bytes ahead of and behind a single-element array, just to see what values I'd get and what to make of them.

#include <stdio.h>

int main(){
    char mem[1];
    printf("\n\tSeeking Ahead...\n\t %lld to %lld\n\n",mem,mem+1000);

    for(int i=0; i <= 1000; i++)
        printf("%lld ",mem[i]);

    printf("\n\n\tSeeking Behind...\n\t %lld to %lld\n\n",mem-1000,mem);

    for(int i=1000; i >= 0; i--)
        printf("%lld ",*(mem-i));

    printf("\n\n-------END------\n\n");
    
    return 0;
} 

The reason for choosing "%lld" was that I had some vague idea that a 64-bit system would have 64-bit addresses, and hence long long (a 64-bit int) may be appropriate.

I didn't use "%d" because that would give me negative values because of int being too small, and "%x" or "%o" would also go crazy with values beyond the size of an unsigned int: for example, I'd get ffff... where int would read negative values.

I'm aware this is basically UB as far as the C standard is concerned, but nothing is truly random, so I'd like to know the probable reasons why:

  • Some values, like 127 and 0, repeat consistently
  • Most of the values displayed are 8-10 digits long and start with the same fixed digits: 4294967...
  • Some apparently random 2- or 3-digit values float in this sea of large numbers, like 123, 18, 55, 96...

I'm not asking why these exact values appear; that would be impossible to answer. I'm asking why the general pattern of 0s, 8-10 digit numbers (with 7 common digits?) and a few normal-looking 2-3 digit values appears, and how to make sense of these values.

Also, I'm only running this on Mac OS X (haven't tried Windows), and with "%c", upon 'seeking forward', it returns actual characters like so:

executable_path=./memdump./memdumpTERM_PROGRAM=Apple_TerminalSHELL=/bin/bashTERM=xterm-256colorTMPDIR=/var/folders...

Why?

Lundin
user13863346
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/221277/discussion-on-question-by-user13863346-making-sense-of-values-at-memory-addresse). – Machavity Sep 10 '20 at 15:31

1 Answer


The printf() specifier %lld expects a long long argument. If you pass something shorter, as you do here, you only set part of that argument and cause UB, on top of the UB you cause by accessing the array out of bounds (yes, I know UB is UB and there are no different kinds of UB, but I want to explain why you get the values you get). On AMD64 the upper 32 bits of the value are probably whatever some earlier part of the program left in the register, and the char argument only changes the lower 32 bits. Every value outside the range of an int is a result of this error. mem[i] is a char, which is promoted to int when it is passed to printf(). Because of that, with a matching specifier you would normally not see a value that fits in an int but lies outside the range of a char.

If you want to run that experiment, use the right format specifier; I would suggest a hexadecimal one. Using an unsigned char would also be smarter. It would still be UB, since it accesses memory out of bounds, but you are more likely to print what is actually stored in memory.

You can use negative values with the [] operator, and this is well defined when the pointer points to the middle or end of an array and the resulting element is still inside the array. That is not the case here, since the index falls outside the array, but it still happens to work. You can also combine both loops into a single one.

  • Thanks a lot for your detailed answer! I appreciate it a lot. Question: other than `%x`, what other format specifiers are appropriate? What if I wanted to read decimal? Is `%hhu` appropriate? – user13863346 Sep 10 '20 at 12:21
  • `%hhu` is for `unsigned char`, this does not make sense for `printf()` because everything shorter than an `int` is promoted to `int`. You would have to use `%u`. – 12431234123412341234123 Sep 10 '20 at 12:30
  • Of course `%hhu` is appropriate, regardless of the promotion. That's *exactly the point*. Even the standard states that explicitly: "Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument **(the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing)**; or that a following n conversion specifier applies to a pointer to a signed char argument." – Antti Haapala -- Слава Україні Sep 10 '20 at 14:03
  • @AnttiHaapala The compiler recommends `%hhn`, and using `%hhu` gives odd results... – user13863346 Sep 10 '20 at 14:40
  • @user13863346 `%hhn`??? You're not *dereferencing the pointer*, I guess? – Antti Haapala -- Слава Україні Sep 10 '20 at 15:24
  • @AnttiHaapala I am, within the second for loop (`printf("%lld ",*(mem-i));`). However, it recommends it at the `printf(Seeking...)` statements, where I wish to print `mem, mem+1000, mem-1000` – user13863346 Sep 10 '20 at 15:36
  • Argh... those are pointers and they *must* be printed as `%p`, or else converted to `uintptr_t` and printed with the `"...%" PRIuPTR "..."` technique https://stackoverflow.com/questions/5795978/string-format-for-intptr-t-and-uintptr-t – Antti Haapala -- Слава Україні Sep 10 '20 at 15:41