4

Let's say that I have a library which runs 24x7 on certain machines. Even if the code is rock solid, a hardware fault can sooner or later trigger an exception. I would like to have some sort of failsafe in place for events like this. One approach would be to write wrapper functions that encapsulate each API as:

returnCode = DEFAULT;
try
{
    returnCode = libraryAPI1();
}
catch (...)
{
    returnCode = BAD;
}
return returnCode;

The caller of the library then restarts the whole thread and reinitializes the module if the returnCode is bad.
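Rather than hand-writing one wrapper per API, the same pattern can be expressed once as a template. This is only a minimal sketch: the `ReturnCode` enum, `guardedCall`, and the two stand-in library functions are hypothetical names introduced for illustration, and the wrapper catches C++ exceptions only, not hardware faults that the platform reports through another mechanism.

```cpp
#include <stdexcept>
#include <utility>

// Hypothetical return codes; the question only names DEFAULT and BAD.
enum ReturnCode { DEFAULT = 0, BAD = -1 };

// Calls any callable that returns ReturnCode and converts an escaping
// C++ exception into BAD, so nothing propagates to the caller.
template <typename Fn, typename... Args>
ReturnCode guardedCall(Fn&& fn, Args&&... args) noexcept {
    try {
        return std::forward<Fn>(fn)(std::forward<Args>(args)...);
    } catch (...) {
        return BAD;
    }
}

// Stand-ins for real library entry points (assumptions for illustration).
inline ReturnCode libraryAPI1() { return DEFAULT; }

inline ReturnCode libraryAPI2(int value) {
    if (value < 0) throw std::runtime_error("negative input");
    return DEFAULT;
}
```

A caller would write `guardedCall(libraryAPI2, -1)` and receive `BAD` instead of an exception. The wrapper says nothing about what state the library was left in, which is the real question here.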

Things CAN go horribly wrong. E.g.

If the try block (or libraryAPI1()) contained:

 func1();
 char *x=malloc(1000);
 func2();

If func2() throws an exception, x will never be freed. In a similar vein, file corruption is a possible outcome.

Could you please tell me what other things can possibly go wrong in this scenario?

Sridhar Iyer
  • 4
    Regarding the example, that's exactly the kind of problem [RAII](http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization) solves. – Georg Fritzsche May 20 '10 at 02:59
  • 1
    My "solution" to a similar situation (a daemon running 24/7) is simply to have a cron job that checks every 2 minutes to make sure the daemon is still running, and restart it if necessary. When a rare exception is thrown, it's often not a big deal to just let the daemon die and restart a couple minutes later. – Joey Adams May 20 '10 at 08:13
  • 2
    If you're down to "a hardware fault might cause this code to break", then what makes you think it's possible to write code to fix it? Your recovery code would be affected too. – jalf May 20 '10 at 11:40
  • Read this: http://stackoverflow.com/questions/161177/does-c-support-finally-blocks-and-whats-this-raii-i-keep-hearing-about/161247 to understand why finally is a poor language design choice that RAII solves. – Martin York May 20 '10 at 16:25

4 Answers

4

This code:

func1();
char *x=malloc(1000);
func2();

Is not C++ code. This is what people refer to as C with classes. It is a style of programming that looks like C++ but does not match how C++ is used in real life. The reason is that good exception-safe C++ code practically never requires the direct use of pointers, as pointers are always contained inside a class specifically designed to manage their lifespan in an exception-safe manner (usually smart pointers or containers).

The C++ equivalent of that code is:

func1();
std::vector<char> x(1000);
func2();
Martin York
3

A hardware failure may not lead to a C++ exception. On some systems, hardware exceptions are a completely different mechanism than C++ exceptions. On others, C++ exceptions are built on top of the hardware exception mechanism. So this isn't really a general design question.

If you want to be able to recover, you need to be transactional--each state change needs to run to completion or be backed out completely. RAII is one part of that. As Chris Becke points out in another answer, there's more to state than resource acquisition.

There's a copy-modify-swap idiom that's used a lot for transactions, but that might be way too heavy if you're trying to adapt working code to handle this one-in-a-million case.
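As a rough illustration of copy-modify-swap, the sketch below updates a container so that either every change is applied or none is. The `Registry` class and its `addAll` method are hypothetical, invented only to show the shape of the idiom.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical state holder used only for illustration.
class Registry {
public:
    // Adds all names or none: work on a copy, then commit with a swap.
    void addAll(const std::vector<std::string>& names) {
        std::vector<std::string> copy = entries_;  // 1. copy (may throw; state untouched)
        for (const std::string& name : names) {    // 2. modify the copy (may throw)
            if (name.empty())
                throw std::invalid_argument("empty name");
            copy.push_back(name);
        }
        entries_.swap(copy);                       // 3. commit; swap does not throw
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<std::string> entries_;
};
```

If step 1 or 2 throws, `entries_` still holds its old contents. The cost is a full copy on every update, which is the heaviness mentioned above.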

If you truly need robustness, then isolate the code into a process. If a hardware fault kills the process, you can have a watchdog restart it. The OS will reclaim the lost resources. Your code would only need to worry about being transactional with persistent state, like stuff saved to files.
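A watchdog of this kind might look roughly like the following on a POSIX system. This is a sketch under assumptions: `superviseWorker` and `flakyWorker` are hypothetical names, the worker is a plain function run in a forked child, and a real deployment would more likely exec a separate binary and add back-off between restarts.

```cpp
#include <cerrno>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `work` in a child process and restarts it until it exits with
// status 0 or `maxRestarts` restarts have been used. Returns the number
// of restarts that were needed, or -1 on fork/wait failure or if the
// worker never succeeded.
inline int superviseWorker(int (*work)(int attempt), int maxRestarts) {
    for (int attempt = 0; attempt <= maxRestarts; ++attempt) {
        pid_t pid = fork();
        if (pid < 0) return -1;
        if (pid == 0) {
            _exit(work(attempt));  // child: exit without running parent's atexit handlers
        }
        int status = 0;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) return -1;  // retry only when interrupted by a signal
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return attempt;
        // Child crashed or reported failure; the OS has reclaimed its
        // memory and descriptors, so loop around and start a fresh one.
    }
    return -1;
}

// Hypothetical worker that crashes on its first two attempts.
inline int flakyWorker(int attempt) {
    if (attempt < 2) std::abort();
    return 0;
}
```

Calling `superviseWorker(flakyWorker, 5)` restarts the child twice before it succeeds. Anything the worker wrote to disk before crashing is not rolled back, which is why persistent state still needs transactional handling.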

Adrian McCarthy
2

Do you have control over the libraryAPI implementation?

If it can fit into an OO model, you need to design it using the RAII pattern, which guarantees that the destructor (which releases the acquired resources) is invoked when an exception is thrown.

Using a resource-management helper such as a smart pointer helps too:

try
{
    someNormalFunction();
    cSmartPtr<BYTE> pBuf = malloc(1000);
    someExceptionThrowingFunction();    
}
catch(...)
{
    // Do logging and other necessary actions
    // but no cleaning required for <pBuf>
}
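With only the standard library, the same idea can be sketched using `std::unique_ptr` and a deleter that calls `std::free`. The names `FreeDeleter`, `MallocBuffer`, `freeCount`, and `runGuarded` are assumptions made for this illustration, and the counter exists only so the cleanup can be observed.

```cpp
#include <cstdlib>
#include <memory>
#include <new>
#include <stdexcept>

// Counts releases so the cleanup can be observed; illustration only.
inline int& freeCount() {
    static int count = 0;
    return count;
}

// Deleter that returns malloc'd memory with std::free.
struct FreeDeleter {
    void operator()(void* p) const noexcept {
        std::free(p);
        ++freeCount();
    }
};

using MallocBuffer = std::unique_ptr<unsigned char, FreeDeleter>;

inline void someExceptionThrowingFunction() {
    throw std::runtime_error("simulated failure");
}

// Returns true if an exception was caught. By the time the handler
// runs, stack unwinding has already released the buffer.
inline bool runGuarded() {
    try {
        MallocBuffer pBuf(static_cast<unsigned char*>(std::malloc(1000)));
        if (!pBuf) throw std::bad_alloc();
        someExceptionThrowingFunction();
    } catch (...) {
        // Logging and other actions go here; no cleanup of pBuf is needed.
        return true;
    }
    return false;
}
```

After `runGuarded()` returns, `freeCount()` has increased by one, showing that the buffer was freed during unwinding rather than in the handler.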
YeenFei
  • Yes, I do have the source code. My question is not how to fix the example, but what other problems I might encounter. Your answer is helpful in case I choose to use this wrapper and refactor. – Sridhar Iyer May 20 '10 at 04:08
  • As I understand it, you are implementing a software "watchdog" in your system. While you can recover from most exceptions, there are cases where you can't pretend things never happened and continue running, namely stack corruption :) – YeenFei May 20 '10 at 07:52
  • You should **always** use RAII for resource management in C++. Guard resources within a stack-allocated object whose destructor performs the cleanup. Then you can pretty much remove all the try/catch clauses. – jalf May 20 '10 at 11:40
  • You will still need some fault containment (try/catch) after implementing RAII. – YeenFei May 21 '10 at 00:53
2

The problem with exceptions is that, even if you do re-engineer with RAII, it's still easy to write code that becomes desynchronized:

void SomeClass::SomeMethod()
{
  this->stateA++;
  SomeOtherMethod();
  this->stateB++;
}

Now, the example might look artificial, but if you substitute stateA++ and stateB++ with operations that change the state of the class in some way, the expected outcome is for the states to remain in sync. RAII might solve some of the problems associated with state when using exceptions, but all it does is provide a false sense of security: if SomeOtherMethod() throws an exception, ALL the surrounding code needs to be analyzed to ensure that the postconditions (stateA.delta == stateB.delta) are met.
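One common way to keep the two states in step is a scope guard that undoes the first change unless the whole method completes. The sketch below is an assumption-laden illustration, not part of the original example: `ScopeGuard` is a hypothetical helper, `failMidway_` exists only to simulate the throw, and `std::function` is used for brevity even though its construction can in principle allocate.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>

// Minimal scope guard: runs `rollback` on destruction unless dismissed.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> rollback)
        : rollback_(std::move(rollback)) {}
    ~ScopeGuard() {
        if (rollback_) rollback_();  // the rollback itself must not throw
    }
    void dismiss() noexcept { rollback_ = nullptr; }

    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> rollback_;
};

class SomeClass {
public:
    explicit SomeClass(bool failMidway) : failMidway_(failMidway) {}

    void SomeMethod() {
        ++stateA;
        ScopeGuard undoA([this] { --stateA; });  // undo stateA if we leave early
        SomeOtherMethod();                       // may throw
        ++stateB;
        undoA.dismiss();                         // both changes made: commit
    }

    int stateA = 0;
    int stateB = 0;

private:
    void SomeOtherMethod() {
        if (failMidway_) throw std::runtime_error("simulated failure");
    }
    bool failMidway_;  // hypothetical switch to simulate the exception
};
```

This only works when the first change has a cheap, non-throwing inverse. It does not remove the need to reason about each method's postconditions; it just gives that reasoning a place to live in the code.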

Chris Becke
  • RAII provides fail-safe handling ONLY for resources. It does not protect an operation/process from falling apart. – YeenFei May 21 '10 at 00:55
  • RAII techniques can be used to easily solve this problem, if you use a generous definition of *resource*. – Ben Voigt May 21 '10 at 22:57