
In assembly language we have instructions like:

mov ax, [1000]

This allows us to access specific memory locations.

But in C can we do something similar to this?

I know inline assembly code using asm() will allow you to do this, but I would like to know about some C specific technique to achieve this.

I tried the following code and got a segmentation fault:

int *ptr = (int *)0xFE1DB124;
*ptr;

This was confusing, because that memory location was reported by the code below:

int var;
printf("\nThe Address is %p", (void *)&var);

So the memory location is available, but I am still getting a segmentation fault.

Why?

S.S. Anne
Deepu
  • Modern OSes randomize memory section addresses (it makes some attacks more difficult), so if you restart the program, the address of your variable might be different. – zch Mar 26 '13 at 13:28
  • Modern OSes do not reveal actual physical addresses to programs. Your `printf` will print a virtual address. I have no idea how you can get past this to get the actual address. Also, the OS will not let your program access memory outside its allocated boundaries. – Anish Ramaswamy Mar 26 '13 at 13:29
  • Which line of code caused the segmentation fault? – SteveP Mar 26 '13 at 13:29
  • @zch: But I thought we will get segmentation fault only if we access parts of the main memory containing system programs. – Deepu Mar 26 '13 at 13:30
  • The line where I tried to assign a specific address to the pointer ptr: `int *ptr=0xFE1DB123;` – Deepu Mar 26 '13 at 13:31
  • That line should be fine. However, when you did `*ptr = some number`, that should fail. – RedX Mar 26 '13 at 13:38
  • @Deepu Depends on the OS. A good (=more secure) OS will give segmentation fault if you access memory that is not allocated to the process. – Klas Lindbäck Mar 26 '13 at 13:47
  • @Deepu: Are you sure you got the address right? Most compilers will put `int` variables on an even address. Also see zch's comment on getting different addresses each execution. – Klas Lindbäck Mar 26 '13 at 13:50
  • @Klas Lindback: I am sorry. I corrected it, but the segmentation fault stands. – Deepu Mar 26 '13 at 13:54

4 Answers


Common C compilers will allow you to set a pointer from an integer and to access memory with that, and they will give you the expected results. However, this is an extension beyond the C standard, so you should check your compiler documentation to ensure it supports it. This feature is not uncommonly used in kernel code that must access memory at specific addresses. It is generally not useful in user programs.

As comments have mentioned, one problem you may be having is that your operating system loads programs into a randomized location each time a program is loaded. Therefore, the address you discover on one run will not be the address used in another run. Also, changing the source and recompiling may yield different addresses.
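
The run-to-run variation described above can be observed with a short sketch. The names `global_var` and `print_addresses` are illustrative and not from the original post; whether the printed values actually change between runs depends on the OS and on how the program was built.

```c
#include <stdio.h>

static int global_var;   /* hypothetical object in static storage */

/* Print the address of a global and of a local variable so that the
   values from separate runs can be compared by eye.
   Returns the number of addresses printed. */
int print_addresses(void)
{
    int local_var = 0;   /* hypothetical object on the stack */

    printf("global: %p\n", (void *)&global_var);
    printf("local:  %p\n", (void *)&local_var);
    return 2;
}
```

Calling `print_addresses()` from `main` and running the program a few times is usually enough to see whether address randomization is active on a given system.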

To demonstrate that you can use a pointer to access an address specified numerically, you can retrieve the address and use it within a single program execution:

#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>


int main(void)
{
    //  Create an int.
    int x = 0;

    //  Find its address.
    char buf[100];
    sprintf(buf, "%" PRIuPTR, (uintptr_t) &x);
    printf("The address of x is %s.\n", buf);

    //  Read the address.
    uintptr_t u;
    sscanf(buf, "%" SCNuPTR, &u);

    //  Convert the integer value to an address.
    int *p = (int *) u;

    //  Modify the int through the new pointer.
    *p = 123;

    //  Display the int.
    printf("x = %d\n", x);

    return 0;
}

Obviously, this is not useful in a normal program; it is just a demonstration. You would use this sort of behavior only when you have a special need to access certain addresses.
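
For the kernel-style use mentioned above, code that touches fixed addresses often wraps the conversion in small helpers. The following is a minimal sketch under the assumption that the implementation maps integers to pointers in the usual flat way; `read_word_at` and `write_word_at` are hypothetical names, not part of any library.

```c
#include <stdint.h>

/* Read a 32-bit word at a numeric address. 'volatile' asks the compiler
   to perform the access exactly as written instead of optimizing it away. */
uint32_t read_word_at(uintptr_t addr)
{
    volatile uint32_t *reg = (volatile uint32_t *)addr;
    return *reg;
}

/* Write a 32-bit word at a numeric address. */
void write_word_at(uintptr_t addr, uint32_t value)
{
    volatile uint32_t *reg = (volatile uint32_t *)addr;
    *reg = value;
}
```

In a user program these helpers are only safe with an address that really belongs to the process, for example `(uintptr_t)&some_uint32_variable`; passing an arbitrary constant will typically crash just like the code in the question.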

Eric Postpischil
  • Which part of (1) converting an integer value to a pointer and (2) accessing that address is an extension to the C standard?? (It always works, by design and by the standard, if the memory is accessible by the program and there is either an object there or you only read/write characters, in order to avoid aliasing and maybe object creation issues. Of course, address layout etc. is implementation defined. I think your program is perfectly valid standard C in every respect.) – Peter - Reinstate Monica Jan 27 '21 at 08:59
  • @Peter-ReinstateMonica: The C standard allows conversion of an integer to a pointer but does not define the result. In order to support using converted integers to access objects, a C implementation must extend the standard by providing a useful definition of the result. More abstractly, the fact that an address valid for use for a type `T` (e.g., has correct alignment) is converted to a `T *` does not mean there is actually a `T` object in the model of the C standard at that address, so, technically, the C implementation must also extend the standard by supporting these manufactured objects. – Eric Postpischil Jan 27 '21 at 12:38
  • In your example there *is* a known object there. "A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value" (https://eel.is/c++draft/expr.reinterpret.cast) Yeah, the question is tagged C but if anything C++ is stricter. And the original value is `&x`. Also, implementation defined behavior (like the mapping of "invented" integers to addresses) is not really an "extension". – Peter - Reinstate Monica Jan 27 '21 at 14:06
  • OK; the C standard draft apparently does not make this round-trip guarantee and simply says "implementation defined" for both conversions plus some "intended to be consistent" footnote wording (6.3.2.3). But still, "implementation defined" is not an "extension"; your program is perfectly standard conformant. (It is also valid on all common systems. If you are in doubt, use a C++ compiler ;-) ). – Peter - Reinstate Monica Jan 27 '21 at 14:16
  • @Peter-ReinstateMonica: (a) Anything that adds a specification of a behavior that is not already specified in the C standard is an extension: It extends the language the implementation is supporting from some set S of specifications of the C standard to some set T that contains all of S plus a little more. – Eric Postpischil Jan 27 '21 at 15:47
  • @Peter-ReinstateMonica: (b) In general, a C implementation does not have to provide addresses to anything other than the objects defined in the language (with explicit definitions, allocations via `malloc`, and so on). It does not have to make other addresses in the memory space available for use in the program, so it does not have to provide conversions from integers to such addresses at all. To make it possible to convert integers to addresses and use them to access memory, a C implementation has to define a memory model (which is an extension of the C standard, since the standard… – Eric Postpischil Jan 27 '21 at 15:47
  • … does not define this); it has to define how the conversion from integers to addresses works (it is not necessarily a flat address space or just a copying of the bits of a number into the bits of a pointer; the conversion can include other information and perform other functions) (and the specification of this conversion is an extension of the C standard); and it has to support accessing memory through those objects even though there are no normally defined objects there (another extension of the C standard). – Eric Postpischil Jan 27 '21 at 15:49
  • @Peter-ReinstateMonica: (c) You may be accustomed to C implementations that are closely tied to the hardware and that present a model in C closely matching the hardware addressing (whether virtual or physical). However, a C implementation is not required to do this by the C standard. A C implementation can satisfy the requirement in C 2018 6.3.2.3 5 that the implementation defines the conversion of an integer to a pointer by defining it to produce a null pointer. For example, a C implementation might be designed merely to support abstract computing of outputs from inputs (e.g., solving math… – Eric Postpischil Jan 27 '21 at 15:53
  • … problems) and not to access arbitrary memory in the address space. Such an implementation does not need to support converting integers to pointers in any meaningful way, and the C standard does not require it to do so. So C implementations that do this are providing features beyond what the C standard requires: They are providing extensions to the C standard. – Eric Postpischil Jan 27 '21 at 15:55
  • Well, I took issue with "extension": Is specifying some implementation defined behavior an "extension"? To me extensions are what's listed e.g. on [this gcc page](https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html), like inline assembler or binary constants. I do see that C apparently doesn't make C++'s round trip guarantee pointer -> suitable_int -> pointer, so yes, there may theoretically be C implementations out there which do not yield the expected behavior here (but all C++ systems should). But still: Your program is entirely conformant standard C, that was my point. – Peter - Reinstate Monica Jan 27 '21 at 16:09
  • @Peter-ReinstateMonica: The sample program is not strictly conforming as the C standard defines it: It may produce different results in different C implementations. Notably, the `int *p = (int *) u;` declaration does not necessarily produce in `p` a pointer equal to `&x`. It is a conforming program as the C standard defines it. And it is given as an example of how integers can be converted to pointers and used in C implementations that support that (conforming code), not as an example that works in all C implementations (strictly conforming code). – Eric Postpischil Jan 27 '21 at 16:20
  • @Peter-ReinstateMonica: Re “Is specifying some implementation defined behavior an "extension"?”: Yes, definitely; the C standard describes it as such. One example is that C 2018 5.1.2.2.1 1 says `main` shall be defined “… or in some other implementation-defined manner”, and J.5.1 gives an example of an implementation defining a third parameter `char *envp[]` as an extension to the language. 6.4.2.1 says the number of characters in an identifier that are significant is implementation-defined. J.5.3 says making all characters significant is an extension. – Eric Postpischil Jan 27 '21 at 16:32
  • I'm glad I didn't say "strictly" then ;-). By the way, the wording in the latest draft (par. 4/9), "An implementation shall be accompanied by a document that defines all **implementation-defined and locale-specific characteristics** and all extensions" (emphasis by me), seems to indicate that the word to describe implementation-defined *things* is "characteristic". Ah, wrote that before your last comment... ok. – Peter - Reinstate Monica Jan 27 '21 at 16:40

To access a specific memory region from user space, we have to map it into the program's virtual address space using mmap(). The C code below shows this for a file:

Take a file "test_file" containing "ABCDEFGHIJ".

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char *map_base_addr;  // Base address of the file mapping
    int fd;               // File descriptor for the open file
    size_t size = 10;

    fd = open("test_file", O_RDWR);  // Open the file for reading and writing
    if (fd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    // Map the file into memory
    map_base_addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map_base_addr == MAP_FAILED) {
        perror("mmap");
        close(fd);
        exit(EXIT_FAILURE);
    }
    close(fd);  // The mapping stays valid after the descriptor is closed

    char *ch = map_base_addr;
    size_t i;

    /* Print the first 10 chars */
    for (i = 0; i < size; i++)
        fputc(ch[i], stdout);
    printf("\n");

    ch[1] = 'b';
    ch[4] = 'z';
    ch[7] = 'x';

    /* Print the chars after modification */
    for (i = 0; i < size; i++)
        fputc(ch[i], stdout);
    printf("\n");

    /* Finally unmap the file. Changes made through a MAP_SHARED mapping
       are carried through to the file. */
    munmap(map_base_addr, size);
    exit(0);
}

The output will be:

ABCDEFGHIJ
AbCDzFGxIJ
SomeOne
  • What OS is it? On Windows, say, you need to lock the memory page to do that. I think it maps a file from and to virtual memory. – Алексей Неудачин Dec 14 '16 at 12:55
  • POSIX [`mmap(2)`](http://man7.org/linux/man-pages/man2/mmap.2.html) doesn't deal with physical addresses. It deals with file offsets. – Peter Cordes Dec 14 '16 at 20:41
  • Also, I thought you were going to use MAP_FIXED to request the mapping at a specific address. That would put known memory contents at a known address so you could use a hard-coded pointer like the question wants to. (Not that this would be a good idea in real code). – Peter Cordes Dec 14 '16 at 20:42

It works for me:

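#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(int argc, char **argv) {
  int var = 7456;
  // Print addresses as integers via uintptr_t; %x with a pointer argument is undefined.
  printf("Address of var = %" PRIxPTR ", var=%d\n", (uintptr_t)&var, var);
  // Hard-coded value taken from an earlier run; it only works if var lands there again.
  int *ptr = (int *)0x22cd28;
  printf(" ptr points to %" PRIxPTR "\n", (uintptr_t)ptr);
  *ptr = 123;
  printf("New value of var=%d\n", var);
  return 0;
}

Program output:

Address of var = 22cd28, var=7456
 ptr points to 22cd28
New value of var=123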

Note:

  1. The address is usually not the same on every execution. When I tried my example I had to run it three times before I got the address to match.

  2. char* can point to any address (because sizeof (char) = 1). Pointers to larger objects must often be aligned on even addresses (usually one divisible by 4).
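
The alignment point in the second note can be checked in code. This is a minimal sketch assuming a C11 compiler (for `_Alignof`) and an implementation where converting a pointer to `uintptr_t` preserves the low address bits; the function names are illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if 'addr' is suitably aligned for an int on this implementation. */
int is_aligned_for_int(uintptr_t addr)
{
    return addr % _Alignof(int) == 0;
}

/* Read an int from any byte address by copying its bytes, instead of
   dereferencing an int pointer that might be misaligned. */
int read_int_unaligned(const void *src)
{
    int value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Checking a numeric address with `is_aligned_for_int` before casting it to `int *`, or falling back to `read_int_unaligned`, avoids the misaligned-access problem on platforms that enforce alignment.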

Klas Lindbäck
  • I am getting a segmentation fault on `*ptr=123;`. I use gcc 4.7.1. – Deepu Mar 26 '13 at 13:43
  • This is just sheer luck, or your program has full access to the complete memory map. – Evert Mar 26 '13 at 13:43
  • It is not an answer to anything; you were just lucky. Run it 100 times, and see if they all succeed. – Andrey Mar 26 '13 at 13:49
  • @Evert,Andrey My point is that there is no magic in the pointer values. You can assign them arbitrary values, and if those arbitrary values point to valid memory addresses you can read and write to them. Hard-coding a specific value and relying on the OS to map the process to that particular address is of course just a matter of luck. – Klas Lindbäck Mar 26 '13 at 13:59
  • @Deepu If you are getting a segmentation fault then your pointer isn't pointing to a valid address. – Klas Lindbäck Mar 26 '13 at 14:01
  • @Andrey It will work if you can get the hardcoded address to match the address your OS assigns to the stack. The easiest way to get the address right is of course to use `ptr = &var`. Anyway, you are right about the poor argument. I added some notes to improve it. – Klas Lindbäck Mar 26 '13 at 14:08
  • "*... if an lvalue does not designate an object when it is evaluated, the behavior is undefined.*" That quote from the standard is saying that outside of your faery fantasy world, `*ptr` is rather chaotic. – autistic Mar 26 '13 at 14:26
  • @modifiable lvalue That something is undefined in the standard doesn't mean that the results are random. In the real world, each compiler will generate specific code that works in a specific way. In many instances this can be utilized as long as you are aware that the code is no longer portable, not even between different versions of the compiler. Sure, you can replace that code with assembler, but assembler code isn't portable either. – Klas Lindbäck Mar 26 '13 at 14:49
  • @KlasLindbäck "Chaos n. Behavior so unpredictable as to appear random, *owing to great sensitivity to small changes in conditions*." There are many definitions for "random". The problem I see is that this isn't *C code*, but *code written in a programming language subtly different to C, where undefined behaviour can be defined. Shall I drive the washing machine to the bank and deposit some raccoon vomit?* – autistic Mar 26 '13 at 16:02
  • @modifiable lvalue If you think that you never need to stray from standard C then you have never really gotten your hands dirty. Out there, in the real world, there are plenty of situations where standard C just doesn't cut it, especially when you do programming close to the hardware. Firmware drivers, embedded systems, or just old systems in general almost always stray from the standard. Without it, you wouldn't even have a washing machine. Make fun of it all you want, but such is the world we live in. – Klas Lindbäck Mar 26 '13 at 16:19
  • @KlasLindbäck If they stray from the standard and they call it "C" then they're setting themselves up for a case of gross neglect and manslaughter. I agree that there are cases where *external libraries* are required; That isn't straying from the standard. That's using *external libraries*. C is an *abstract programming language*; It may be translated to machine code by your compiler, but it *isn't* close to hardware. Consider the `sizeof` operator and VLAs for example, or forward declaration of structs; Do either of these exist "close to hardware"? No. What happens to storage duration? – autistic Mar 26 '13 at 23:27
  • @KlasLindbäck More likely, you experience behaviour that differs from system to system because *you* stray from the standard, and the result is *undefined behaviour*, because the standard tells you which guarantees are made in regards to consistent, predictable behaviour across every C implementation. It's unlikely that you have hardware or compilers that pre-exist C89. If you do, throw it out! It's far from optimal. It's highly probable that the op shop will give you a pentium with 133MHz CPU to work on. – autistic Mar 26 '13 at 23:33
  • If, however, you do come across a *recent* implementation that *claims* that it's a C implementation, and it doesn't ensure the guarantees of the C89, C99 or C11 standards are reasonably met, then you have something to take to the office of consumer and business affairs, or any court where they will say something like "This is dangerous! You're misleading people, giving them a different behaviour. What happens if their car crashes because of this?". A bike and a river doesn't constitute a washing machine, though they can be used for one; That doesn't mean it'd be appropriate to market that. – autistic Mar 26 '13 at 23:41
  • After all, let us consider that navigation software might be written and compiled to a GPS using such a compiler. What happens if software that works by the C standard (which is a legal document stating the requirements for a compiler to be labelled a "C compiler") fails to work? Who gets the blame when people get lost and killed at sea? The person who wrote the code according to the standard? They did their job, like any trade-person, and followed the rules. No. The compiler developers who produced a faulty compiler? Do you see what sort of guarantees the C standard gives you, now? – autistic Mar 26 '13 at 23:45
  • @modifiable lvalue You are either a troll or totally ignorant of the real world. Either way, I see no point in continuing this discussion. – Klas Lindbäck Mar 27 '13 at 06:57
  • I'm a troll?! Have you looked in the mirror, lately? – autistic Mar 27 '13 at 10:25
  • It will not always be possible to access memory by address. Depending on the OS, the kernel may not allow it because of memory access protection, so you will get a segmentation fault (for example, on OS X with this example). – Developer Oct 13 '15 at 01:13
  • *Pointers to larger objects must often be aligned on even addresses (usually one divisible by 4).* Not required in x86 (except for 16B vectors with SSE aligned loads/stores), and this whole thing is obviously totally platform-specific. This would be a better answer if you make a bigger deal of pointing out that the pointer value is specific to your system, and was determined through experimentation. Starting off with "works for me" fails to address the question's misconception that the entire virtual address space is writeable. – Peter Cordes Dec 14 '16 at 20:29
  • @Seb: A C99 or C11 compiler can define the behaviour of whatever stuff it wants, except in cases where the ISO C standard requires the compiler to reject something. For example, a normal C implementation for 32-bit x86 uses pointers that are 32-bit and freely convertible from integers, in the way you'd expect if you know x86 asm. An implementation can and should document its implementation-specific behaviour along with its choices for things that the ISO standard requires implementations to define (e.g. CHAR_BIT and `sizeof(long)`). – Peter Cordes Dec 14 '16 at 20:36
  • Of course, on a normal modern system, predicting how a dereference of a hard-coded pointer will compile to x86 asm doesn't make it useful. Unless you do other stuff (like `mmap()` system calls), you won't know what's going to be there. And ASLR makes it inconsistent from run to run for parts of the address space covered by normal code, data, and stack space. – Peter Cordes Dec 14 '16 at 20:39
  • @PeterCordes Are we talking about C here, or x86? Well, there are no x86 tags in this question... so I wouldn't assume that's the topic. – autistic Jan 08 '17 at 03:26
  • @Seb: That was just an example. But the OP's asm example did use x86 asm. – Peter Cordes Jan 08 '17 at 03:29
  • @PeterCordes Are you aware of the possibility of bus errors on x86 architectures due to misaligned accesses? There's an example on [the Wikipedia topic page for bus errors](https://en.wikipedia.org/wiki/Bus_error)... Again, undefined behaviour, chaotic, etc... We're talking about C, not assembly, and C has undefined behaviour which is impacted by compilers, OSes, etc... Your x86 system is not the same as all. – autistic Jan 08 '17 at 03:34
  • @Seb: dereferencing a pointer initialized from a literal integer is always implementation-defined or undefined behaviour in C, so it only makes sense to talk about specific platforms anyway. But yes, systems like Solaris on SPARC will SIGBUS on unaligned pointers. Linux on x86 will not: unaligned SSE access will SIGSEGV. Unaligned accesses narrower than 16B will succeed (unless they SIGSEGV for the usual unmapped-page reason). Different OSes on x86 could deliver different signals for unaligned SSE accesses. Or might even allow user-space to run with the AC flag set... – Peter Cordes Jan 08 '17 at 03:42
  • @PeterCordes The SIGBUS example on Wikipedia is for x86 Linux; you might want to re-evaluate your previous response: *"systems like Solaris on SPARC will SIGBUS on unaligned pointers. Linux on x86 will not"* <--- This in particular is invalid. – autistic Jan 13 '17 at 04:22
  • @Seb: unaligned *SSE* accesses are translated to SIGSEGV on x86 Linux, like I said. `#AC` faults from enabling the AC flag (like that wikipedia example does with pushf/popf) are translated into SIGBUS on x86 Linux as that example shows. But you can't safely do that on x86 GNU/Linux, because some glibc functions use unaligned accesses (e.g. `strlen` IIRC), so I don't consider AC-enabled to be relevant. Nobody does that. Misaligned SSE faults raise `#GP`, IIRC, and definitely not `#AC`. (See http://wiki.osdev.org/Exceptions). – Peter Cordes Jan 13 '17 at 07:49
  • @PeterCordes I'm not talking about unaligned *SSE* accesses; I'm talking about *other* unaligned accesses on x86. For the third and final time, please check the example I linked to. After you've done that, please read each comment you've posted here and filter through them, asking yourself "Does this contribute anything towards the answer?" and deleting any that don't. – autistic Jan 24 '17 at 04:36

Your question doesn't really make much sense if you are running on Linux/Windows/Mac/whatever.

http://en.wikipedia.org/wiki/Virtual_memory

You can do that only if you are programming a device without virtual memory, or if you are programming the operating system itself.

Otherwise the addresses you see are not the "real" addresses in RAM. The operating system translates them to physical addresses, and if there is no mapping for your virtual address, you get a segmentation fault. Keep in mind that other things can cause a segmentation fault as well.
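
Whether a given virtual address has a mapping can be checked before touching it. The sketch below is Linux-specific and assumes the `/proc/self/maps` format in which each line begins with `start-end` in hexadecimal; `address_is_mapped` is a hypothetical helper, and it says nothing about whether the mapping is readable or writable.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if 'addr' lies inside a mapping listed in /proc/self/maps,
   0 if it does not, and -1 if the file cannot be opened (e.g. not Linux). */
int address_is_mapped(uintptr_t addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[512];
    int found = 0;
    int at_line_start = 1;

    while (!found && fgets(line, sizeof line, f) != NULL) {
        uintptr_t start, end;

        /* Only parse chunks that begin a line: "start-end perms ..." */
        if (at_line_start &&
            sscanf(line, "%" SCNxPTR "-%" SCNxPTR, &start, &end) == 2 &&
            addr >= start && addr < end)
            found = 1;

        /* A chunk without '\n' means the line continues in the next read. */
        at_line_start = strchr(line, '\n') != NULL;
    }

    fclose(f);
    return found;
}
```

The address of a local variable would normally be reported as mapped, while a made-up constant such as the one in the question usually would not be, which is consistent with the segmentation fault.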

LtWorf
  • The question does not ask to access memory by physical address. The assembly code it shows will access memory by virtual address, when built into a program in the normal ways. The question asks for C code to do the same thing. Therefore, it is asking for C code to access memory by virtual address. – Eric Postpischil Mar 26 '13 at 13:45
  • Why would you want to access a specific virtual address? It doesn't make any sense. I think the OP just doesn't know about virtual memory so asked the wrong question. – LtWorf Mar 26 '13 at 14:32
  • Because you are going to write a linker or loader and need to construct addresses from scratch. Because you are learning by exploring the address space. Because you are preparing code that will eventually be in the kernel but want to try it out in user space for convenience first. I do not know. But I know what question was asked, and I answered it. If you think the question is not useful or a different question was intended, you can comment on the question and ask the poser of the question. – Eric Postpischil Mar 26 '13 at 15:28
  • @EricPostpischil: Easier way to "explore virtual address space": `less /proc/self/maps` on Linux, to see all the mappings for a process. – Peter Cordes Dec 14 '16 at 20:43
  • Also, if you want to read a given memory address in C, you can assign a numeric value to a pointer variable and read through it. It's not difficult. – LtWorf Dec 15 '16 at 14:31