-6

I mean by the numerical value of the address itself, not by the value it points to. For example, if an address is 0x0, we know it is illegal, but if it is 0xffffeeee234560, how can I tell whether it is normal or abnormal? Furthermore, how can I know whether this address belongs to the text segment, the data segment, the heap, or the stack?

I have used pmap and cat /proc/&lt;pid&gt;/smaps to look for clear rules, but I could not find a reliable method. All I know is that heap addresses are higher than the text segment, and stack addresses are higher than the heap.

imp25
basketballnewbie
  • 2
    Out of curiosity, why are you wanting to know this? – Dai Jan 05 '14 at 04:14
  • 7
    You can't tell except by using intensely platform specific knowledge, and the information may not be available outside the lowest levels of the kernel. – Jonathan Leffler Jan 05 '14 at 04:19
  • 2
    Depending upon your application, you might find this experience with identifying illegal pointers in Windows interesting: http://blogs.msdn.com/b/oldnewthing/archive/2006/09/27/773741.aspx – Ron Burk Jan 05 '14 at 04:31
  • 1
    You might want to look at [temporarily disabling ASLR](http://askubuntu.com/questions/318315/how-can-i-temporarily-disable-aslr-address-space-layout-randomization) while you investigate. – ldav1s Jan 05 '14 at 04:43
  • 3
    **Why do you ask**? Is it for debugging purposes, or because you want to play wild tricks in your own memory allocator? – Basile Starynkevitch Jan 05 '14 at 07:22
  • 1
    Everyone is discussing this question in quite some depth. I just want to know whether there is some obvious feature of an address (e.g. it is too high or too low) from which I can judge that it is most likely illegal, to help me debug a program more easily. – basketballnewbie Jan 05 '14 at 08:04
  • 1
    What do you mean by "judging an address"? In the debugger by yourself, or in the process by your program? Why do you want to "judge an address"? What is your original concern? And addresses don't have "features"!! They are just abstract 64-bit values. – Basile Starynkevitch Jan 05 '14 at 08:08
  • You cannot without a lot of hassle. Just use either addresses returned by 'new', 'malloc' and the '&' operator to avoid any problems in the first place. – Ed Heal Jan 05 '14 at 08:20
  • 2
    Please **explain your overall concern** and **what motivates your question** (is it debugging your program or implementing your memory allocator); each of us is guessing differently at the motivation and context of your question. Please *edit your question* to improve it. – Basile Starynkevitch Jan 05 '14 at 08:35

4 Answers

2

If you are debugging your program (compiled with gcc -Wall -g), the gdb debugger will tell you if some address is illegal. Also use valgrind and the address sanitizer of GCC 4.8 (gcc -fsanitize=address)...

If you want to know, from inside your program, whether some particular address is in the address space of the process running it, and in which segment it lies, you could write a routine that parses /proc/self/maps. Reading that file (or other files from proc(5) ...) is really fast, since such files don't exist on disk.
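A minimal sketch of such a routine, assuming Linux and the usual `start-end perms offset dev inode path` line layout of /proc/self/maps. The names `struct map_hit` and `lookup_address` are made up for illustration and are not part of any library; lines longer than the buffer are simply not handled.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Result of a lookup; illustrative type, not a standard API. */
struct map_hit {
    int  found;      /* 1 if the address lies inside some mapping */
    char perms[5];   /* e.g. "r-xp" */
    char path[256];  /* backing file, "[heap]", "[stack]", or "" if anonymous */
};

/* Scan /proc/self/maps for the mapping that contains addr (Linux only). */
static struct map_hit lookup_address(const void *addr)
{
    struct map_hit hit;
    memset(&hit, 0, sizeof hit);

    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return hit;              /* no procfs: report "not found" */

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    while (fgets(line, sizeof line, f) != NULL) {
        uint64_t lo, hi;
        char perms[5] = "";
        char path[256] = "";
        /* Fields: start-end perms offset dev inode [path]; path is optional. */
        int n = sscanf(line,
                       "%" SCNx64 "-%" SCNx64 " %4s %*s %*s %*s %255[^\n]",
                       &lo, &hi, perms, path);
        if (n < 3)
            continue;            /* line did not parse; skip it */
        if (target >= lo && target < hi) {
            hit.found = 1;
            assert(strlen(perms) < sizeof hit.perms);
            strcpy(hit.perms, perms);
            strcpy(hit.path, path);
            break;
        }
    }
    fclose(f);
    return hit;
}
```

Calling `lookup_address(&some_local)` from the main thread would typically report a `[stack]` mapping with `rw-p` permissions, and a NULL pointer would typically come back with `found == 0`. The answer is only a snapshot: another thread can map or unmap memory right after the file is read.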

There is usually not a single text segment, or a single data segment, or a single stack segment (think of multi-threaded applications and dynamic linking) in a given process. There are several segments in your address space (the address space of a process running your program), which are usually "randomly" laid out by the kernel because of ASLR. (ASLR can be disabled system-wide)

However, if address values matter to your application at runtime (sometimes it is interesting to encode some type information in the address, e.g. to allocate pairs in one segment, triplets in another, and larger objects elsewhere), you should take the opposite approach: explicitly manage large memory segments yourself (e.g. aligned to a megabyte) with mmap(2) and munmap(2) (which malloc(3) and posix_memalign(3) may themselves call), and when you reserve your segments, register them in an appropriate container (e.g. a std::map in C++). Then you can easily code a routine which, given some arbitrary address (any void*), returns your segment containing it (or else nullptr).

Don't forget that malloc can be used internally by many library routines (including printf and the C++ standard containers...), so malloc is likely to be in use even without you knowing it. You may be interested in mallinfo(3), malloc_info(3), mallopt(3). Read also the C dynamic memory allocation & memory management wikipages, and study if needed the source code of malloc (the one inside MUSL libc is easy to read).
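The segment-registry idea could look roughly like the following in C. This is a sketch under stated assumptions: a fixed-size array stands in for the std::map mentioned above, the names `segment_reserve` and `segment_find` and the `kind` tag are hypothetical, and the regions are only page-aligned (getting megabyte alignment would need extra over-allocation and trimming that is not shown).

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define MAX_SEGMENTS 64

/* One registered region; `kind` is a caller-chosen tag (hypothetical). */
struct segment {
    void  *base;
    size_t size;
    int    kind;
};

static struct segment segments[MAX_SEGMENTS];
static size_t segment_count;

/* Reserve `size` bytes with mmap and record the region.
   Returns NULL if the table is full or mmap fails. */
static struct segment *segment_reserve(size_t size, int kind)
{
    assert(size > 0);
    if (segment_count == MAX_SEGMENTS)
        return NULL;
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    struct segment *s = &segments[segment_count++];
    s->base = p;
    s->size = size;
    s->kind = kind;
    return s;
}

/* Return the registered segment containing addr, or NULL if none does. */
static struct segment *segment_find(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    for (size_t i = 0; i < segment_count; i++) {
        uintptr_t lo = (uintptr_t)segments[i].base;
        if (a >= lo && a - lo < segments[i].size)
            return &segments[i];
    }
    return NULL;
}
```

With this, `segment_find(p)->kind` tells you which of your own pools an object came from, and a NULL result means the address was not allocated through this registry (it may still be perfectly valid memory from malloc or the stack). A linear scan is fine for a handful of segments; a sorted array or tree would suit larger counts.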

Consider reading Advanced Linux Programming; it should help you.

Basile Starynkevitch
1

Regardless of whether it's absolutely illegal, it's illegal for any particular part of your code to use unless you know, as a logical consequence of how you obtained that address, that it's valid and points to an object you can legally modify.

In short, if you have to ask, it's illegal.

R.. GitHub STOP HELPING ICE
-1

For a user process running on 32-bit Linux with the default 3G/1G split, any virtual address from 0xc0000000 upwards is illegal, because it belongs to the kernel. On 32-bit Windows without the /3GB boot switch, any virtual address at or above 0x80000000 is illegal for the same reason (if /3GB is used and the process executable has the large-address-aware flag, IMAGE_FILE_LARGE_ADDRESS_AWARE, set, the boundary is at the same address as in Linux).

Staying with 32-bit Linux, if you are able to walk through the page directory of your own process, you can take any virtual address and see whether it is mapped to an actual physical page and, if so, which permissions that page has.

Beware: memory allocated with malloc() is of course legal, but if you try to check its legality by inspecting the page directory, you will find that almost all of a freshly allocated block appears as "not present", so you would wrongly infer that the address is illegal. When the process accesses a valid address within the block whose page is not yet present, a page fault is triggered. It does not become a segfault; instead the kernel silently allocates physical memory for the page containing the accessed address and restarts the instruction that caused the fault.
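The difference between "legal but not yet present" and "illegal" can be observed from user space without a device driver. The sketch below swaps in mincore(2) for the page-directory walk described above, so it is not the same technique: it asks the kernel whether a page is resident, and it assumes Linux. The helper name `page_is_resident` is made up for illustration.

```c
#define _DEFAULT_SOURCE          /* for mincore() and MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page holding addr is resident in RAM, 0 if it is mapped
   but not resident, and -1 on error (for example, addr is not mapped). */
static int page_is_resident(const void *addr)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    /* mincore requires a page-aligned start address. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    unsigned char vec = 0;
    if (mincore((void *)start, (size_t)page, &vec) != 0)
        return -1;
    return vec & 1;              /* only the low bit is defined */
}
```

On a typical Linux system, a page of a fresh anonymous mapping (which is what a large malloc usually hands out) is expected to report 0 until something writes to it and 1 afterwards, while an address outside every mapping reports -1. Only the -1 case corresponds to an address that would fault fatally.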

With a suitable device driver, a process can know about its own memory map and walk through its page directory. Of course, this is heavily implementation specific. See here for details.

Community
mcleod_ideafix
  • 1
    An address could belong to some segment and be `free`-d.... – Basile Starynkevitch Jan 05 '14 at 07:20
  • At least in 32-bit Linux, an address that belongs to a block allocated with `malloc()`, then `free()`-ed, becomes illegal because the kernel unmaps it from physical memory, i.e. the process memory map marks the page that address belongs to as "not present" (hence, illegal). – mcleod_ideafix Jan 05 '14 at 07:23
  • No, you are completely wrong: all the `malloc` implementations that I looked in are reusing small `free`-d zones, and they stay in still `mmap`-ed heap segments. `malloc` tries hard to avoid `mmap` and `munmap` by managing previously `free`-d zones (to give them thru future `malloc`-s). – Basile Starynkevitch Jan 05 '14 at 07:26
  • While doing some experiments with the code that generates this memory map ( http://stackoverflow.com/questions/20792158/is-kernel-space-mapped-into-user-space-on-linux-x86/20792205#20792205 ) I could see that right after free(), nearly all of the allocated block becomes "not present" memory. – mcleod_ideafix Jan 05 '14 at 07:29
  • If you `malloc` and `free` (in *random* order) small memory zones (e.g. of 80 bytes each) they usually stay in `mmap`-ed segment after `free` (which would not call `munmap`...). – Basile Starynkevitch Jan 05 '14 at 07:36
  • Well. That makes sense. My experiments were using 4MB (or bigger) blocks. – mcleod_ideafix Jan 05 '14 at 07:52
  • BTW: in my answer I'm not speaking about recently free()-d blocks, but about freshly malloc()-ed blocks that haven't actually been used yet, so why the downvote? What is wrong with it? – mcleod_ideafix Jan 05 '14 at 07:55
-2

You could write a program in assembly that stores some values in the heap, stack, text and data segments, then takes the addresses of those values and prints them.

This would only be valid for that single program, however, and might not be consistent across different runs of the program. If an address is between the addresses of the first and last stack items, it should be legal, same with between the first and last text items, first and last data items, and so on.
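As one of the comments below notes, assembly is not required for this experiment. A rough C version, assuming a hosted Linux toolchain such as gcc, might look like the following; `struct sample_addresses`, `sample` and `print_samples` are illustrative names, and converting a function pointer to an integer is implementation-defined rather than strictly portable.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int initialized_global = 42;  /* usually placed in .data */
static int zeroed_global;            /* usually placed in .bss  */

/* One representative address per region (illustrative type). */
struct sample_addresses {
    uintptr_t text, data, bss, heap, stack;
};

/* Collect the address of one object from each region. */
static struct sample_addresses sample(void)
{
    int on_stack = 0;
    void *on_heap = malloc(16);
    assert(on_heap != NULL);

    struct sample_addresses s;
    s.text  = (uintptr_t)&sample;              /* code of this function */
    s.data  = (uintptr_t)&initialized_global;
    s.bss   = (uintptr_t)&zeroed_global;
    s.heap  = (uintptr_t)on_heap;
    s.stack = (uintptr_t)&on_stack;

    free(on_heap);   /* only the numeric value is kept, never dereferenced */
    return s;
}

/* Print the sampled addresses in hexadecimal. */
static void print_samples(const struct sample_addresses *s)
{
    printf("text  0x%" PRIxPTR "\n", s->text);
    printf("data  0x%" PRIxPTR "\n", s->data);
    printf("bss   0x%" PRIxPTR "\n", s->bss);
    printf("heap  0x%" PRIxPTR "\n", s->heap);
    printf("stack 0x%" PRIxPTR "\n", s->stack);
}
```

Calling `sample()` and passing the result to `print_samples` on several runs shows the caveat stated above: with ASLR enabled the printed values generally change from run to run, so they describe one execution only.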

LogicChains
  • This doesn't really answer the question as written. At best, it works only if the address being examined happens to fall within one of the three ranges you give, but other addresses could be valid in a particular execution at different times. – Jim Garrison Jan 05 '14 at 04:41
  • I should have been clearer; by 'and so on', I meant that it worked for all the ranges described (heap, stack, text, data). Since those ranges (and bss, which could be treated similarly) usually cover the entire amount of memory allocated to a program, if they were all known then it could be said that any address outside of them was not valid. And of course, it's only valid at the time the addresses are measured; if the program state changes and more/less stack/heap is used, then obviously the ranges would need to be recalculated if possible. – LogicChains Jan 05 '14 at 04:49
  • No need to use *assembly* for that. – Basile Starynkevitch Jan 05 '14 at 07:00
  • True, but using assembly requires less familiarity with C, and it's easier to see exactly where data is placed. – LogicChains Jan 05 '14 at 08:22