1

Suppose you allocate an array arr of size n as follows:

int arr[n]; // Allocated fine
printf("%d", arr[n]); // Segfault if possible

Is there a value of n for which I can always trigger a segfault on the printf line? The answer could be specific to some OS.

I know it's undefined behavior, and I know that reading or writing out of bounds touches another area of memory, which will (likely) cause me major problems later on.

My professor said that it will not always segfault, and I'm curious whether there's any way to create an array of some size, in some situation, on some type of OS or computer, that will reliably segfault every time.

Is this possible or not? Is there some condition I can create under which a single out-of-bounds access will always trigger a segfault?

Is it theoretically possible for this to always hold, even if it won't happen in practice all the time?

Water
  • No, this is not possible – M.M Jan 19 '16 at 23:38
  • @M.M if you can say why, I'd be more than glad to accept this as the answer. – Water Jan 19 '16 at 23:38
  • If arr is on the stack, then on many architectures it will *never* segfault. – Oliver Charlesworth Jan 19 '16 at 23:40
  • Not with `int arr[n]`, but it is possible [with some trickery] to do it if it allocated as in `int *arr = funky_alloc(n)`. Are you interested in that? – Craig Estey Jan 19 '16 at 23:42
  • The best practice is to look out for out-of-bounds errors yourself instead of letting them happen at runtime. Anyway, it would really have unexpected behavior, and it depends on the system and also on the compiler you are using – Pooya Jan 19 '16 at 23:45
  • Accessing an array out of bounds is **undefined** behaviour. A segfault is a courtesy of your run-time environment resp. OS. So nasal daemons can also very well appear. – too honest for this site Jan 19 '16 at 23:45
  • @PooyaSaeedi: The term is _undefined behaviour_. And it exactly describes what can happen. – too honest for this site Jan 19 '16 at 23:46
  • `*(int *)NULL` is more likely to segfault – Déjà vu Jan 19 '16 at 23:50
  • Hmmm, `((char *)NULL)[0]` looks like a "1 over the array size" that can segfault. It may even be reliable on select OSes. It's UB per the C spec; you need an OS spec that says so. – chux - Reinstate Monica Jan 19 '16 at 23:52
  • @ringø: Not necessarily. The compiler might very well optimise that away. – too honest for this site Jan 19 '16 at 23:52
  • Set up a DOS box, program the x86 MMU appropriately, e.g. excluding addresses 0..4095, and run some program accessing that page. But what does that prove? – too honest for this site Jan 19 '16 at 23:54
  • @CraigEstey Yes, I'd be interested in that. – Water Jan 20 '16 at 00:45
  • @Olaf that proves that there is a condition whereby you can always cause a segfault, whereas I was told such a thing is impossible anywhere. While this is what I was told, I am actually curious if such a thing is truly the case... is what I'm being told *actually* true, or am I getting misinformation? – Water Jan 20 '16 at 00:47
  • @Water: It actually is not possible on a normal, running system, because there is just too much going on. Such a setup would be completely artificial and would not prove anything other than that the hardware is working. And that is very specific. IOW: it is useless for normal usage on a normal, operating desktop or server system. – too honest for this site Jan 20 '16 at 01:00

4 Answers

6

In the general case, as Ben notes, it is undefined behaviour. The general answer is: don't ever rely on undefined behaviour; its effects are never deterministic.

There are, however, two sure-fire ways to cause this on specific, modern, run-of-the-mill systems, which cover a large cross-section of modern PCs, but they are not portable across all compilers, architectures, operating systems, etc.

  1. Create an array and align it to the stack boundary, then try accessing element arr[-1], or align it to the other extreme. This is not guaranteed, but it is very likely to fault, since the OS won't let you access protected memory; and if you're writing to an RODATA segment, that's that.
  2. On Linux, compile your code with -fstack-protector-strong and watch it deliberately crash when you smash the stack (see the sketch after this list). It's a good idea to enable this on test builds of your software during code-coverage tests: better to crash in the testing phase and fix it than to deploy it and have it crash in production.
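
For the second approach, the following minimal sketch (the file name is made up; it assumes GCC or Clang on a typical Linux system) shows the protector in action. Strictly speaking the runtime aborts with SIGABRT rather than SIGSEGV, but the crash on the overflow is just as deliberate:

// smash.c -- clobber the stack canary on purpose
// build: gcc -fstack-protector-strong smash.c -o smash
#include <string.h>

int main(void)
{
    char buf[8];

    // write far past the end of buf, overwriting the canary the
    // compiler placed between the locals and the return address
    memset(buf, 'A', 64);

    // the canary check on function exit fails and glibc kills the
    // process with "*** stack smashing detected ***"
    return buf[0];
}

On some distributions this flag is already enabled by default in the system compiler, so test builds may get this protection for free.
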
Cloud
  • I have about 20 different systems here that will not segfault. Most of them because they don't have an MMU. PC-people always forget there are much, much more other CPUs than x86 ;-) – too honest for this site Jan 19 '16 at 23:56
  • @Olaf D'oh! Well, gonna keep the answer up. Your comment supplements it nicely. – Cloud Jan 20 '16 at 00:04
  • Hmm, you claim they are "sure-fire ways", but also state they are not portable (there is no "a bit portable"). So I think you should define exactly the conditions under which they are "sure ...". Both statements cannot be true at the same time. – too honest for this site Jan 20 '16 at 00:11
  • @Olaf "PC-people always forget there are much, much more other CPUs than x86" --> Been using a PC for 30 years. Programming embedded for decades too. Also mainframes, 68xxx, graphics processors, ancient 30-bit processors. So I do not agree with _always_. – chux - Reinstate Monica Jan 20 '16 at 00:11
  • @chux: You are an insignificant deviation from the mean, much like me :-). Just that in the late '70s and '80s desktop computers were less powerful than current MCUs (including a lot of 8/16-bit MCUs). A lot of the desktops even had much less RAM (hell, adding Flash, my old Atari-ST had less usable memory than the MCU of my current project has on-chip). (I still love 680x0 assembler.) – too honest for this site Jan 20 '16 at 00:16
4

No. An out-of-bounds access is undefined behavior, and UB means anything can happen. On a standard system you can usually find a way that will consistently cause a segfault, but in no way is this guaranteed. If you change something else in the code, the layout of the binary may shift, changing your stack allocation and changing the result of the program.

As an example, on a PowerPC 5200 (not a great MMU) running RTEMS 4.9.2, the following code does not create a segfault:

int arr[5];
arr[6] = 10;

In fact, even this doesn't create a segfault:

int *p = 0;
while (1)
   *(p--) = 666;

Really, undefined means Undefined.

To do it in a print statement, you can do things like:

printf("%d", arr[n]); // out-of-bounds access
printf("%f", arr[n]); // wrong type for the conversion specifier

But I will reiterate: while this might segfault for you repeatably in a specific circumstance, in no way is it guaranteed to always happen that way.

To reliably stop a POSIX system with a SIGSEGV, your best bet is to raise it yourself:

raise(SIGSEGV); // declared in <signal.h>

More information about forcing a SIGSEGV signal can be found here: How to programmatically cause a core dump in C/C++

and here:

C++ Creating a SIGSEGV for debug purposes

Fantastic Mr Fox
  • OP: `I'm curious if there's anyway to create an array of some size in some situation with some type of OS or computer that will reliably segfault every time.` <- of course this is possible. – Ctx Jan 19 '16 at 23:46
  • @Ctx, absolutely not. I have yet to find a reliable way to consistently crash the system I mentioned in my answer. You cannot say there is a reliable way to crash any computer. – Fantastic Mr Fox Jan 19 '16 at 23:48
  • @Ctx: If you define the system state completely (which you can), of course. But what is it worth, as it might behave differently once you change a few bits? – too honest for this site Jan 19 '16 at 23:49
  • @Ben "some situation, some OS, some computer" means one given constellation, not _all_ constellations, i.e. linux-3.x on x86. – Ctx Jan 19 '16 at 23:49
  • @Ben: OP does not ask about "any" computer, but a specific system and a well-defined system state. That would exclude background processes, etc., of course. But I don't see much sense in that, either. – too honest for this site Jan 19 '16 at 23:50
  • @Ctx Yes, I agree, but I still don't think you can make any *guarantee* that it will fail every time. Reliable, sure, but by throwing a SIGSEGV from your program you can absolutely guarantee it. – Fantastic Mr Fox Jan 19 '16 at 23:51
  • @Ben On Linux i386 with gcc, take a char[1] array on the stack (which is somewhere around 0xbfxxxxxx); accessing element 0x10000000 will land around 0xcfxxxxxx, which is kernel memory and thus protected from userspace. Guaranteed to segfault. – Ctx Jan 19 '16 at 23:53
  • @Ben: Just set up a DOS-box, program the MMU yourself and run appropriate code. Et voila! – too honest for this site Jan 19 '16 at 23:58
  • @Ctx: I somehow doubt that works on a 32-bit Linux as you stated. – too honest for this site Jan 19 '16 at 23:59
  • @Ben: Please read my comments carefully. While I disagree with your claim that it is not possible to set up such a scenario, I don't see much use in it either. The only thing such a setup proves is that the MMU and interrupt/CPU event handling work properly. And I somehow doubt OP writes test code for CPUs. – too honest for this site Jan 20 '16 at 00:07
  • @Olaf, I never meant to claim that it is impossible to cause a system to segfault over and over. I think I am right in saying that you are never guaranteed that this will be the behavior regardless of other parts of the program. – Fantastic Mr Fox Jan 20 '16 at 00:14
2

Caveat: This uses an array from a malloc, so technically, it's not quite the same.

But, this will add a "guard" page/area at the end, which always causes a segfault.

I've often used this to debug "off-by-one" array indexing. I've found it to be so useful, that I've added it as part of a malloc wrapper in my production code.

So, if the intent is to come up with something that debugs a real problem, this may help:

// segvforce -- force a segfault on going over bounds

#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>

#ifndef PAGESIZE
#define PAGESIZE        4096
#endif

// number of guard pages that follow
// NOTE: for simple increments, one is usually sufficient, but for larger
// increments or more complex indexing, choose a larger value
#ifndef GUARDCNT
#define GUARDCNT        1
#endif

#define GUARDSIZE       (PAGESIZE * GUARDCNT)

// crash_alloc -- allocate with guard-page ("overbound") protection
void *
crash_alloc(size_t curlen)
{
    size_t pagelen;
    void *base;
    char *endp;

    pagelen = curlen;

    // align up to page size
    pagelen += PAGESIZE - 1;
    pagelen /= PAGESIZE;
    pagelen *= PAGESIZE;

    // add space for guard pages
    pagelen += GUARDSIZE * 2;

    base = NULL;
    if (posix_memalign(&base,PAGESIZE,pagelen) != 0) {
        fprintf(stderr,"crash_alloc: posix_memalign failed\n");
        exit(1);
    }
    printf("base: %p\n",base);

    // point to end of area (char * so the pointer arithmetic is standard C)
    endp = (char *) base + pagelen;
    printf("endp: %p\n",(void *) endp);

    // back up to the guard page and protect it
    endp -= GUARDSIZE;
    printf("prot: %p\n",(void *) endp);
    if (mprotect(endp,GUARDSIZE,PROT_NONE) == -1) {
        perror("crash_alloc: mprotect");
        exit(1);
    }

    // point to the caller's area, flush against the guard page
    endp -= curlen;
    printf("fini: %p\n",(void *) endp);

    return endp;
}

// main -- main program
int
main(int argc,char **argv)
{
    int n;
    int *arr;
    int idx;
    int val;

    n = 3;
    arr = crash_alloc(sizeof(int) * n);

    val = 0;
    // with n == 3, arr[3] lands in the guard page, so the last
    // iteration (idx == n) faults deterministically
    for (idx = 0;  idx <= n;  ++idx) {
        printf("try: %d\n",idx);
        val += arr[idx];
    }

    printf("finish\n");

    return val;
}
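
With n == 3 above, arr[0] through arr[2] sit immediately below the protected page, so the access to arr[3] lands in the guard page: a typical run prints the addresses, then try: 0 through try: 3, and dies with SIGSEGV before ever reaching the "finish" line.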
Craig Estey
1

As others have noted, you cannot ensure a segmentation fault in the general case; you can only try, with an elaborate allocation method, to make it more systematic on some systems.

There is a better way to debug your code and detect this kind of error: there is a very efficient tool for exactly that, valgrind. Check whether it is available for your environment.
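
For instance, this minimal sketch (the file name is made up; it assumes gcc and valgrind are installed) contains an off-by-one heap read that valgrind flags even when the program happens not to crash:

// oob.c -- off-by-one read on a heap array
// build and run under valgrind:
//   gcc -g oob.c -o oob
//   valgrind ./oob
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int n = 5;
    int *arr = malloc(sizeof(*arr) * n);

    if (arr == NULL)
        return 1;

    // one past the end: valgrind reports "Invalid read of size 4"
    // with a stack trace, whether or not this segfaults natively
    printf("%d\n", arr[n]);

    free(arr);
    return 0;
}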

chqrlie