-2

I have this code:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main (int argc, char** argv) {

   *(volatile uint8_t*)0x12345678u = 1;
   int var = *(volatile uint8_t*)0x12345678;
   printf("%i", var);
   printf("%i", &var);

   return (EXIT_SUCCESS);
}

I want to see a 1 and the address of that int, which I specified previously. But when I compile it with gcc and run it in bash, only "command terminated" is shown, without any error. Does anyone know why?

PS: I am a newbie to C, so I am just experimenting.

Marco Bonelli
  • Why do you think this memory location is even accessible? You can't just mess with random memory locations when it is managed by OS and MMU on different levels. – Eugene Sh. Aug 22 '19 at 18:36
  • You've got lots to learn about [how memory works](https://en.wikipedia.org/wiki/Virtual_memory) on modern computer systems. – user3386109 Aug 22 '19 at 18:53

3 Answers

4

What you are doing:

*(volatile uint8_t*)0x12345678u = 1;
int var = *(volatile uint8_t*)0x12345678;

is totally wrong.

You have no guarantee whatsoever that an arbitrary address like 0x12345678 will be accessible, not to mention writable by your program. In other words, you cannot set a value to an arbitrary address and expect it to work. It's undefined behavior to say the least, and will most likely crash your program due to the operating system stopping you from touching memory you don't own.

The "command terminated" that you get when trying to run your program happens exactly because the operating system is preventing your program from accessing a memory location it is not allowed to access. Your program gets killed before it can do anything.


If you are on Linux, you can use the mmap function to request a memory page at an (almost) arbitrary address before accessing it (see man mmap). Here's an example program which achieves what you want:

#include <sys/mman.h>
#include <stdio.h>

#define WANTED_ADDRESS (void *)0x12345000
#define WANTED_OFFSET 0x678 // 0x12345000 + 0x678 = 0x12345678

int main(void) {
    // Request a memory page starting at 0x12345000 of 0x1000 (4096) bytes.
    unsigned char *mem = mmap(WANTED_ADDRESS, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

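    // Check if the OS granted the request at all.
    if (mem == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }

    // Without MAP_FIXED the address is only a hint, so check where the page ended up.
    if (mem != WANTED_ADDRESS) {
        fprintf(stderr, "mmap returned %p instead of the wanted address\n", (void *)mem);
        return 1;
    }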

    // Get a pointer inside that page.
    int *ptr = (int *)(mem + WANTED_OFFSET); // 0x12345678

    // Write to it.
    *ptr = 123;

    // Inspect the results.
    printf("Value  : %d\n", *ptr);
    printf("Address: %p\n", (void *)ptr); // %p expects a void *

    return 0;
}
Marco Bonelli
  • But why am I not able to write my data to an arbitrary address I want? Why does the OS stop me from doing this if the address is free and not used by any other job? – Patrik Patan Pastyyr Aug 22 '19 at 18:56
  • @PatrikPatanPastyyr because you have to first *ask* the OS for the memory before using it. The OS does not magically give you access to all the memory of your computer. – Marco Bonelli Aug 22 '19 at 18:57
  • For the same reason you can't break into a random apartment even if nobody is home. *It is not yours*. Worth noting, that if we are speaking of a bare-metal system without MMU enabled, this might work given the address is known to be within the memory range. – Eugene Sh. Aug 22 '19 at 18:58
  • @patrick `Why does OS stop me from this` - because if it didn't, any malware could read all the passwords stored in your chrome. And check your browser history. Happily, it can't. – KamilCuk Aug 22 '19 at 19:04
  • Well, that sucks. I see that my OS manages my memory usage, but how does the algorithm for allocating memory space work if it is so randomized? – Patrik Patan Pastyyr Aug 22 '19 at 19:09
  • @PatrikPatanPastyyr I'm not sure I understand what you are asking. If you are asking about how dynamic memory allocation works (for example `malloc()`), I would strongly suggest that you first do a Google search, and if you encounter any problems writing programs, write another question, since that is a different question than the one you posted. If you are asking about memory layout randomization... that's another different topic, see [here](https://en.wikipedia.org/wiki/Address_space_layout_randomization) for a start. – Marco Bonelli Aug 22 '19 at 19:10
  • @PatrikPatanPastyyr In my book, that doesn't suck -- it's wonderful, it's the way things ought to be. If on the other hand you want to be able to do whatever you want, whenever you want, you need an embedded system with no OS and no MMU. (And such systems do exist, and on them you can do, yes, just about whatever you want.) – Steve Summit Aug 22 '19 at 19:49
  • @SteveSummit what kind of book do you mean? – Patrik Patan Pastyyr Aug 22 '19 at 19:52
  • @PatrikPatanPastyyr Figure of speech. In my experience, in my opinion, in the kind of work that I do. I gladly accept the limitations placed on me by the OS, the MMU, and by not being root. Those restrictions cost me either not at all, or their cost is repaid a thousandfold in security, reliability, and robustness. (But we're getting into opinion territory.) – Steve Summit Aug 22 '19 at 21:26
  • @PatrikPatanPastyyr: Writing to any arbitrary address *that your program doesn't explicitly own* can have all kinds of bad effects - you could overwrite part of the machine code (leading to a crash or unexpected behavior), you could overwrite part of your runtime stack (a very popular malware exploit), you could overwrite system bookkeeping data (such as your `malloc` arena), etc. If you *need* extra memory, you need to either use the standard library functions (`malloc`/`calloc`/`realloc`) or use a system call like `mmap`. – John Bode Aug 22 '19 at 21:26
0

The operating system and loader do not automatically make every possible address available to your program. The virtual address space of your process is constructed on demand by various operations of the program loader and of services inside the process. Although every address “exists” in the sense of being a potential address of memory, what happens when a process attempts to access an address is controlled by special data structures in the system. Those data structures control whether a process can read, write, or execute various portions of memory, whether the virtual addresses are currently mapped to physical memory, and whether the virtual addresses are not currently mapped to memory but will be provided with physical memory when needed. Initially, much of a process’ address space is marked not in use (or at least implicitly marked, in that none of the explicit records for the address space apply to it).

In the executions of your program you have attempted so far, the address 0x12345678 has not been mapped and marked available to your process, so, when your process attempted to use it, the system detected a fault and terminated your process.

(Some systems randomize the layout of the address space when a program is being loaded, to make it harder for an attacker to exploit bugs in a program. Because of this, it is possible that 0x12345678 will be accessible in some executions of your program and not others.)

Eric Postpischil
0

The quote from C11 standard 6.5.3.2p4:

4 The unary * operator denotes indirection. [...] If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

You use the * operator on the pointer (volatile uint8_t*)0x12345678u. Is this a valid pointer? Is it an invalid pointer? What is an "invalid value" of a pointer?

There is no check that lets you find out which particular pointer values are valid and which aren't; the C language does not provide one. A random pointer may just happen to be a valid pointer, but most probably it is an invalid one, in which case the behavior is undefined.

Dereferencing an invalid pointer is undefined behavior. But, stepping outside the scope of C and into the operating system: on *nix systems, trying to access memory that you are not allowed to access should raise a SIGSEGV signal in your program and terminate it. Most probably this is what happens here. Your program is not allowed to access the memory location at address 0x12345678; the operating system specifically protects against that.

Also note that systems use ASLR, so pointer values within your program are indeed random to some degree. They are not linear either, i.e. *(char*)0x01 will not access the first byte in your RAM. The operating system (or, more exactly, the underlying hardware as configured by the operating system) translates pointer values in your program to physical locations in RAM using what is called virtual memory. The same pointer value may just happen to be valid on the second run of your program. But because pointers can have so many values, most probably it isn't a valid pointer. Your operating system kills your program when it detects an invalid memory access.

KamilCuk
  • What the heck? I hardly understand. I just have the idea that pointers are just an int that points to the address of another int, but I really don't understand how it could be "RANDOM". That is insane, isn't it? – Patrik Patan Pastyyr Aug 22 '19 at 18:53
  • Then read the wikis about virtual memory and ASLR, it's explained there better than I ever could ; ) So `(char *)rand()` may _just happen_ to work. But what are the chances of rolling, from a 2^64 address space, a number within your process address space (let's assume 4K)? Oh right - it's 0.00000000000002%. Imagine it like this: there is a special register in the CPU that adds a magic constant to memory locations in your program. That special register is configured by your operating system. And it's random for each program with ASLR. And you can't read the value in that register. – KamilCuk Aug 22 '19 at 18:57
  • I mean, I will definitely read something about ASLR (even though I know nothing about it so far), but why is it even there? The magic CPU register that gives a constant (of what type, by the way), does it have a special purpose? – Patrik Patan Pastyyr Aug 22 '19 at 19:15
  • @KamilCuk maybe I'm being pedantic here, but ASLR has got nothing to do with virtual memory. In fact, in ASLR the virtual addresses themselves are random, not the mapping to physical memory. – Ajay Brahmakshatriya Aug 22 '19 at 19:17
  • @KamilCuk also, it doesn't matter even if `(volatile uint8_t*)0x12345678u` happens to be valid pointer. Dereferencing pointers *not* derived from `&` (address of) or `malloc` and family of functions itself is undefined behavior, even if the value is a valid pointer location. – Ajay Brahmakshatriya Aug 22 '19 at 19:19
  • @Ajay `even if the value is a valid pointer location` - I have quoted the standard for you. The standard doesn't define what a valid pointer is. If it is valid, then it's valid; assuming alignment is ok and the stars are in proper positions, you can dereference it. There was some work on [pointer provenance](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2090.htm), we'll see if it makes it into the standard. So is `int a; int *b = rand(); if (&a == b) { printf("%d\n", *b); }` invalid? This is a thin line. – KamilCuk Aug 22 '19 at 19:27
  • @AjayBrahmakshatriya I believe the best answer would be: a random pointer can be valid or not, this is implementation defined. Your statement "Dereferencing pointers not derived from & (address of) or malloc is undefined behavior" just assumes that the program should track where pointers come from. And, outside of that, it would make 100% of programs for microcontrollers and bare-metal targets undefined behavior. There was [this extension](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf) but I haven't seen anyone adopting it. – KamilCuk Aug 22 '19 at 19:29
  • If I understand the standard correctly, `int a; int *b = rand(); if (&a == b) { printf("%d\n", *b); }` should always be undefined because `b` is not derived from a valid pointer. Regarding "...just assumes that program should track where pointers come from..." -- no, it doesn't *have* to, because the behavior is undefined; it is not explicitly supposed to report an error. – Ajay Brahmakshatriya Aug 22 '19 at 19:33
  • Regarding microcontrollers and bare-metal targets - yes, they do violate the standard from time to time. – Ajay Brahmakshatriya Aug 22 '19 at 19:35
  • @KamilCuk thanks for the pointer on pointer provenance. I will try to go through it and figure out if my current understanding is wrong. – Ajay Brahmakshatriya Aug 22 '19 at 19:36
  • @AjayBrahmakshatriya What the heck does it mean that a pointer has undefined behaviour? So what REALLY is a pointer (not just the explanation school gives me, like pointing to some address)? I mean, if it is like you are saying and it has undefined behaviour, then how was it created and how does it work internally? I don't believe that a pointer, created by humans, has undefined behaviour and that no one knows how it behaves. I just don't understand your statement. – Patrik Patan Pastyyr Aug 22 '19 at 19:44
  • @PatrikPatanPastyyr "Dereferencing pointer not derived from `&` or `malloc` is undefined behavior". "Dereferencing pointer" means applying `*` operator on a pointer. – KamilCuk Aug 22 '19 at 19:47
  • @KamilCuk I didn't really mean what dereferencing a pointer means, but rather what is meant by the statement "undefined behaviour". WHAT is undefined behaviour? Is it just random? Is dereferencing a pointer just random? Or what? – Patrik Patan Pastyyr Aug 22 '19 at 19:55
  • @PatrikPatanPastyyr Maybe this will help: https://stackoverflow.com/questions/2397984 – user3386109 Aug 22 '19 at 20:14