
This is an unusual question to ask but here goes:

In my code, I accidentally dereference NULL somewhere. But instead of the application crashing with a segfault, it seems to stop execution of the current function and just return control back to the UI. This makes debugging difficult because I would normally like to be alerted to the crash so I can attach a debugger.

What could be causing this?

Specifically, my code is an ODBC driver (i.e., a DLL). My test application is ODBC Test (odbct32w.exe), which allows me to explicitly call the ODBC API functions in my DLL. When I call one of the functions which has a known segfault, instead of crashing the application, ODBC Test simply returns control to the UI without printing the result of the function call. I can then call any function in my driver again.

I do know that technically the application calls the ODBC driver manager which loads and calls the functions in my driver. But that is beside the point as my segfault (or whatever is happening) causes the driver manager function to not return either (as evidenced by the application not printing a result).

One of my co-workers with a similar machine experiences this same problem while another does not, but we have not been able to determine any specific differences between the machines.

Trevor
  • I love the title of this question... but I can't *possibly* ship it without bugs!! –  Jan 07 '11 at 19:43
  • You seem to have provided everything, except the code. Perhaps the compiler is optimizing away the code that does nothing. – Ian Boyd Jan 07 '11 at 19:44
  • I suppose the exception is being handled somewhere? – EboMike Jan 07 '11 at 19:44
  • I'm with EboMike, something will be catching the AV – David Heffernan Jan 07 '11 at 19:45
  • Depending on the context - normally any such action would result in a page-fault error, since you cannot read 0x00000000 (or nearby). This throw would propagate up the stack to whatever handler takes it. No handler -> back to the OS/runtime system which generally speaking would unload the offending .exe and terminate the process. There are tons of opportunities in that chain to intercept that behavior, not knowing your context, can't say what exactly may be interceding on your behalf. – Mordachai Jan 07 '11 at 19:47
  • Ian: I can attach a debugger and step through the code up to the point of where the segfault should happen. When I step the next line it simply returns control back to the application. – Trevor Jan 07 '11 at 19:48
  • EboMike: This is C/C++. A segfault shouldn't be turned into a handleable exception other than raising a signal. And this application doesn't handle signals as far as I can tell. Like I said, the same application on one of my co-workers machines will crash as expected. – Trevor Jan 07 '11 at 19:55
  • @Trevor: You might want to try using windbg instead of visual studio's debugger. It's not always easy to see exceptions in visual studio's debugger. – Larry Osterman Jan 08 '11 at 06:20
  • @Trevor: this is Windows. An access violation is catchable via Structured Exception Handling (SEH), an OS-level mechanism. – Seva Alekseyev Apr 20 '11 at 02:57

5 Answers


Windows has non-portable language extensions (known as "SEH") which allow you to catch page faults and segmentation violations as exceptions.

There are parts of the OS libraries (particularly inside the OS code that processes some window messages, if I remember correctly) which have a __try block and will make your code continue to run even in the face of such catastrophic errors. Likely you are being called inside one of these __try blocks. Sad but true.

Check out this blog post, for example: The case of the disappearing OnLoad exception – user-mode callback exceptions in x64

Update:

I find it kind of weird the kind of ideas that are being attributed to me in the comments. For the record:

  • I did not claim that SEH itself is bad.

    I said that it is "non-portable", which is true. I also claimed that using SEH to ignore STATUS_ACCESS_VIOLATION in user mode code is "sad". I stand by this. I should hope that if I had the nerve to do this in new code and you were reviewing it, you would yell at me, just as if I wrote `catch (...) { /* Ignore this! */ }`. It's a bad idea. It's especially bad for an access violation because getting an AV typically means your process is in a bad state, and you shouldn't continue execution.

  • I did not argue that the existence of SEH means that you must swallow all errors.

    Of course SEH is a general mechanism and not to blame for every idiotic use of it. What I said was that some Windows binaries swallow STATUS_ACCESS_VIOLATION when calling into a function pointer, a true and observable fact, and that this is less than pretty. Note that they may have historical reasons or extenuating circumstances to justify this. Hence "sad but true."

  • I did not inject any "Windows vs. Unix" rhetoric here. A bad idea is a bad idea on any platform. Trying to recover from SIGSEGV on a Unix-type OS would be equally sketchy.
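To make the `catch (...)` comparison concrete, here is a minimal portable sketch, not the actual Windows code. It uses an ordinary C++ exception as a stand-in for the access violation, because `__try`/`__except` is MSVC-specific. The names `dispatch`, `buggy_driver_function`, `healthy_driver_function`, and `demo` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a driver entry point with a bug. A C++ exception
// simulates the fault so the sketch stays portable (assumption: the real
// failure is an access violation surfaced through SEH).
int buggy_driver_function() {
    throw std::runtime_error("simulated fault");
}

// Hypothetical driver entry point that works.
int healthy_driver_function() {
    return 42;
}

// Hypothetical host-side dispatcher that swallows every failure raised by the
// callback. Returns true and stores the result only if the callback returned.
bool dispatch(int (*callback)(), int* result) {
    try {
        *result = callback();
        return true;
    } catch (...) {
        // Swallowed: nothing is logged and nothing is rethrown, so the
        // caller just sees "no result" and carries on.
        return false;
    }
}

// Shows the symptom from the caller's point of view.
void demo() {
    int result = -1;

    // The buggy call produces no result, yet the process keeps running.
    assert(!dispatch(buggy_driver_function, &result));
    assert(result == -1);

    // A later call works as if nothing had happened.
    assert(dispatch(healthy_driver_function, &result));
    assert(result == 42);
}
```

From the caller's side, the failing call simply reports no result and the next call works normally, which is the same shape as the symptom in the question.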

asveikau
  • I wouldn't call SEH a language extension, I think of it more as a service provided by the OS. I'm not really sure about window message callbacks - what are they? Do you mean WNDPROCs? Although Windows is not the same as UNIX, that fact alone does not make it inferior. – David Heffernan Jan 07 '11 at 19:50
  • @David Heffernan - Yes I mean `WNDPROC` s and I am not saying anything about whether or not the notion of a `WNDPROC` makes something inferior or superior, only that it is a bad idea to use SEH to catch `STATUS_ACCESS_VIOLATION`. Catching catastrophic behavior, not killing the process, and pretending nothing happened is categorically evil. – asveikau Jan 07 '11 at 19:53
  • That seems reasonable but doesn't explain why the code would crash on one machine but hide the crash on another. – Trevor Jan 07 '11 at 19:57
  • @Trevor - I might suggest looking at the program in the debugger on each machine and inspecting the stack to see if it is really the same thing going on. In Windbg ("Debugging tools for Windows", a free download from MSFT) you might have to use the command "`sxe`" which breaks on exceptions even if they are caught. – asveikau Jan 07 '11 at 20:02
  • @asveikau SEH is usually used by tool vendors to implement C++ exceptions, or indeed other languages with exceptions. SEH doesn't swallow exceptions, it raises and transports them. If you call certain Win32 API functions, and SEH is active, and you pass in NULL pointers, then you can raise AVs in the Windows DLLs, I sometimes see this happen in kernel32 (which is not actually the kernel just to confuse matters!) If these exceptions return to my app and remain unhandled, then, yes, my app will terminate. – David Heffernan Jan 07 '11 at 20:24
  • @asveikau My point essentially is that on Windows, with SEH, it is perfectly normal for an access violation to result in an exception which if, unhandled, will terminate the app. If you choose, in your apps to swallow the SEH exceptions silently, then that is your mistake and not a design flaw in Windows. – David Heffernan Jan 07 '11 at 20:25
  • There is nothing wrong with SEH, any more than there is a problem with signals. It is just an OS mechanism for handling OS-level exceptions. They predate C++ (Win32 was originally written in C, which has no exceptions). The only "language extension" is that they allow a C/C++ app to hook into the OS-level SEH transport and trap such exceptions just like a C++ exception. It is extremely useful IF you know what you're doing. – Mordachai Jan 07 '11 at 21:01
  • @David Heffernan - I suggest you re-read my post, possibly several times. I feel you are putting words in my mouth. I have edited the post to clarify what I think you have misread. – asveikau Jan 07 '11 at 22:55
  • @asveikau It's alright, I think I got it the first time round! And I still don't see SEH as a language extension. Which language is being extended? Last time I checked, referencing invalid pointers was UB. Or did I get that wrong? – David Heffernan Jan 07 '11 at 23:04
  • @David Heffernan - I would call `__try` an extension in C. C does not have exceptions, AFAIK `__try` is implemented in the compiler rather than via macro magic, and last I knew GCC does not support `__try`. – asveikau Jan 07 '11 at 23:48
  • @asveikau __try is a language extension but it is not the same thing as SEH – David Heffernan Jan 08 '11 at 00:02
  • @Mordachai: Theoretically you're right that there's nothing wrong with SEH. But in practice it's extraordinarily hard to get SEH right. That's why Microsoft has banned the use of SEH internally with some fairly limited exceptions (kernel mode code probing user mode addresses, RPC/COM). – Larry Osterman Jan 08 '11 at 06:22
  • I'm accepting this as the answer. I was able to use a debugger to see the SEH exception. I still can't figure out why the application is hiding the exception for me and not my co-worker, but I can't investigate that any further since I don't have the source to that. – Trevor Jan 09 '11 at 18:40
  • @Larry - Interesting. I have never had difficulties making SEH work with our C++ applications. But then I only use it to trap exceptions outside of the C++ space (the true SEH's thrown by CPU hardware, such as floating point errors, stack overflow, etc.). Then again, why would you use it for anything else? – Mordachai Jan 10 '11 at 15:04
  • @Mordachai: The problem is that trapping exceptions can lead to either reliability problems (you catch the exception and you turn a crash into a memory corruption or deadlock) or security holes. Just about the only thing you should do in an exception handler is terminate the process and given that the OS will do that for you (and save a crash dump for you to get later), it's typically better to just let the OS handle them. – Larry Osterman Jan 11 '11 at 06:39
  • @Larry - I respectfully disagree. I have, for example, written a stack overflow handler that correctly traps, unwinds the stack, and restores the stack w/o endangering anything. It is no trivial code, and it is totally OS-dependent. But it is a valid use of SEH. Similarly, I have written a generic SEH trap that produces a mini-dump of the execution state before proceeding to crash out to the OS. This too is massively useful in getting back severe error diagnostics from very hard to reproduce problems in the field. – Mordachai Jan 11 '11 at 18:35
  • @Mordachai: I've talked to the guys who own the windows error diagnostic stuff in the past - getting exception handling right is *staggeringly* hard. In particular running code in a process where you can make no assumptions about the state of the process is quite challenging. And that's the state you're in when you deal with an access violation exception. – Larry Osterman Jan 19 '11 at 19:45
  • @Larry: it may well be staggeringly hard, yet if the circumstances are correct, it is doable. Our application is staggeringly hard to crash, much to the pleasure of our users. Each action is essentially attempted, and if things go horribly wrong, then the action is discarded without affecting the rest of the system's stability, at least in theory. I'm not advocating that your average programmer should run around messing with SEH, but for our product it has worked very well. – Mordachai Jan 19 '11 at 20:21
  • @Mordachai: Realistically the only safe thing to do from a handler is to freeze all the current threads in the process and invoke an external debugger. Any attempts at inspecting the state of the current process from an exception filter or handler is likely to fail. That's why writing an exception handler in an MSFT product automatically guarantees that shipping your app will be blocked on extensive external reviews. – Larry Osterman Jan 20 '11 at 02:03
  • @Larry: hogwash. Just because one thread encounters an infinite loop due to the exact conditions encountered by the user's data doesn't invalidate the state of other threads (unless they're serving that one). Even then, it's quite possible to design your software such that unwinding one thread terminates (cleanly) the subordinate threads. And since C++ has excellent stack-unwinding capabilities, you can cleanly come back to a stable starting point, and allow the user to change the conditions and retry from there. – Mordachai Jan 20 '11 at 16:09
  • Do not forget about FTH - Fault Tolerant Heap - See my post below! It IS your problem on Windows 7! – Петър Петров Apr 25 '12 at 12:44
  • On whether SEH is an OS service or a language extension - in practice it's both. There's an API which hypothetically you could use without the language extension, which was described [here](http://www.microsoft.com/msj/0197/Exception/Exception.aspx). It's also complex, "underdocumented" (at least when that was written in 1997) and dependent on some details of how the compiler generates machine code. Using it directly even requires some inline assembler, so no-one is realistically going to do that. In practice, what people use is the syntactic sugar - the language extension. –  Nov 23 '13 at 09:38

Dereferencing a NULL pointer is undefined behavior, which can produce almost anything: a segfault, a letter to the IRS, or a post to Stack Overflow :)

Gene Bushuyev
  • Is that in ascending order of badness? :) – EboMike Jan 07 '11 at 19:55
  • This is 100% correct - but not really an answer to this specific question. – Mordachai Jan 07 '11 at 20:57
  • It actually is. "Undefined behavior" means it can crash on a machine with operating system X but doesn't crash on another machine with operating system Y. In fact, ANY change (different compiler, different platform, different configuration, different time of day) could yield different results. – EboMike Jan 07 '11 at 22:30
  • @EboMike or Descending order of Madness – Newtopian Mar 25 '15 at 19:29

Windows 7 also has its Fault Tolerant Heap (FTH), which sometimes does such things. In my case it was also a NULL dereference. If you develop on Windows 7, you really want to turn it off!

What is Windows 7's Fault Tolerant Heap?

http://msdn.microsoft.com/en-us/library/dd744764%28v=vs.85%29.aspx

Петър Петров

Read about the different kinds of exception handlers here -- they don't catch the same kind of exceptions.

user541686

Attach your debugger to all the apps that might call your DLL, and in the [debug]|[exceptions] menu turn on the feature to break when an exception is thrown, not just when it is unhandled.

ODBC is mostly (if not entirely) COM, so unhandled exceptions will cause issues, which could appear as the ODBC function exiting strangely or, at worst, hanging and never returning.

Greg Domjan
  • 100% agreed - what's probably happening is that some code higher in the stack is catching the exception. The way to find this is to start your app under a debugger (or attach a debugger to your app) and enable catching first chance exceptions. It should hit fairly easily. – Larry Osterman Jan 08 '11 at 06:19