1

I have an object with the following method:

int PathSubstitution::updateField(Field4memo &field, int record_id, int field_id) const
{
    int type = field.type();
    if((type == r4str) || (type == r4memo) || (type == r4unicode))
    {
        string value = field.str();
        trim(value);
        if(!substituteDriveLetters(value))
            return -1;
        if(!substituteGridMount(value))
            return -1;

        return field.assign(value.c_str(), value.length());
    }
    return r4success;
}

When I build this code with my Debug profile in Visual Studio C++ 2010 everything works just fine. This method gets called 4 times, on four unique Field4memo objects, and it works.

When I build this code with my Release profile the method works the first time it's called, but causes Vista Enterprise to display a "program.exe has stopped working" dialog window. The "View problem details" area of the window says:

Problem signature:
  Problem Event Name:   BEX
  Application Name: program.exe
  Application Version:  0.0.0.0
  Application Timestamp:    4ef4edc6
  Fault Module Name:    program.exe
  Fault Module Version: 0.0.0.0
  Fault Module Timestamp:   4ef4edc6
  Exception Offset: 0000668a
  Exception Code:   c0000409
  Exception Data:   00000000
  OS Version:   6.0.6002.2.2.0.256.4
  Locale ID:    1033
  Additional Information 1: 6243
  Additional Information 2: 0d5daf38e26c963685a835e6f40ff03d
  Additional Information 3: aa53
  Additional Information 4: 5d02a603659cce53ff840117c3a9c7a7

The BEX event name indicates a buffer overflow. But which buffer, I cannot tell.

Here's where it gets weird for me though...

When I change this method and add an unnecessary cout line to it, it works with the Release profile:

int PathSubstitution::updateField(Field4memo &field, int record_id, int field_id) const
{
    int type = field.type();
    if((type == r4str) || (type == r4memo) || (type == r4unicode))
    {
        // THIS IS THE NEW LINE I ADDED RIGHT BELOW HERE!!!
        cout << endl;
        string value = field.str();
        trim(value);
        if(!substituteDriveLetters(value))
            return -1;
        if(!substituteGridMount(value))
            return -1;

        return field.assign(value.c_str(), value.length());
    }
    return r4success;
}

I can't tell why the method crashes with the Release profile or why adding the cout line resolves the crashing issue. I'm uncomfortable just accepting the "cout fixes it" answer -- can someone help me understand what my problem is here and why the cout fixes it? How does the cout call save me from a buffer overflow here?

Edit: some additional context for the call to this method was asked for. It's called in a loop. With the test input I'm using, it's called 4 times. The function that calls it looks like so:

int PathSubstitution::updateRecord(Data4 &dbf, int record_id) const
{
    // Update all fields
    int numFields = dbf.numFields();
    for(int i = 1; i <= numFields; i++ )
    {          
        Field4memo field(dbf, i);
        int rc = updateField(field, record_id, i);
        if(rc != r4success)
            return rc;
    }
    return r4success;
}

Edit 2: Flushing the cout buffer also fixes the overflow problem as long as cout.flush() is called from within the PathSubstitution::updateField method and before the return field.assign(value.c_str(), value.length()); line.

Edit 3: This is promising. If I comment out the calls the substituteDriveLetters() and substituteGridMount() methods the program doesn't crash. So it's something to do with those method calls (which use pcre to do some regular expression string substitutions).

Edit 4: If I comment out just the substituteDriveLetters() method it works. So I've got a prime suspect now. This method is supposed to replace a drive letter in a path with the corresponding UNC value. None of the fields in my test input are file paths so this should be a null op as far as data transformation is concerned.

bool PathSubstitution::substituteDriveLetters(string &str, string::size_type offset) const
{
    int offsets[6];
    int groups = pcre_exec(drivePattern, NULL, str.c_str(), str.size(), 0, 0, offsets, sizeof(offsets));
    if(groups < 0)
    {
        switch(groups)
        {
        case PCRE_ERROR_NOMATCH:
        case PCRE_ERROR_PARTIAL:
            return true;
        case PCRE_ERROR_NOMEMORY:
            cerr << "WARNING: Out of memory." << endl;
            break;
        case PCRE_ERROR_BADUTF8:
        case PCRE_ERROR_BADUTF8_OFFSET:
            cerr << "WARNING: Bad UNICODE string." << endl;
            break;
        default:
            cerr << "WARNING: Unable to substitute drive letters (Err: " << groups << ")" << endl;
            break;
        }
        return false;
    }

    char driveLetter = toupper(str[offsets[2]]);
    DriveMap::const_iterator i = driveMap.find(driveLetter);
    if(i == driveMap.end())
    {
        cerr << "ERROR: The " << driveLetter << " drive is not mapped to a network share." << endl;
        return false;
    }

    string::iterator start = str.begin() + offsets[0];
    string::iterator end = str.begin() + offsets[1];
    str.replace(start, end, i->second);

    return substituteDriveLetters(str, offsets[1]);
}
Ian C.
  • 3,783
  • 2
  • 24
  • 33
  • 1
    Perhaps because `endl` does more than just starting a new line, it flushes the buffer! – Tamer Shlash Dec 23 '11 at 21:25
  • Can you show the context in which this function is being called? Here's what my gut is telling me: You are calling this function many times per second in some loop. For some reason, after a certain number of iterations, you are getting the error. Debug mode, as well as the cout statement are slowing down your program dramatically, as they tend to do. Slower program means fewer iterations per second. It will probably still crash eventually, but you're not waiting long enough. Just a guess. – Benjamin Lindley Dec 23 '11 at 21:27
  • 1
    Maybe [Program only crashes as release build](http://stackoverflow.com/q/186237/237483) gives you some ideas. – Christian Ammer Dec 23 '11 at 21:30
  • @BenjaminLindley it is being called from a loop. FWIW this code has run fine for years under Windows < Vista Enterprise (primarily Win2k8 Server and WinXP). – Ian C. Dec 23 '11 at 21:33
  • @ChristianAmmer that's a good link. I'll chase this down assuming it's an array out of bounds error. Though, in which array, I'm not sure. Maybe `string value` isn't being GC'ed after the first call? – Ian C. Dec 23 '11 at 21:35
  • Does the code compile with 0 warnings? – Martin York Dec 23 '11 at 21:37
  • @IanC. GC'ed? Garbage Collected? That's not done in C++(generally). The string is destroyed at the closing brace of the if statement. – Benjamin Lindley Dec 23 '11 at 21:42
  • What's `Data4`, what's `Field4memo`? – CB Bailey Dec 23 '11 at 21:47
  • 1 through `numFields` inclusive? Not 0 through `numFields-1` inclusive? Is that really right? If it's not crashing in the debugger, then you could put in statements to check the validity of all inputs... or put in your own print statements to verify that they look like they should.... –  Dec 23 '11 at 21:51
  • @BenjaminLindley sorry, yea, garbage collection. It's been a while since I've cracked open any C++ – Ian C. Dec 23 '11 at 21:56
  • @CharlesBailey Data4 is file system based, database format. Field4memo is the record field object. I believe is based on an old FoxPro format. These are objects from a vendor-supplied library. – Ian C. Dec 23 '11 at 22:00
  • @Hurkyl yes, the index bounds on the loop are correct. The DB format is 1-indexed. In debug mode all the output is sane. I can put a cout that prints `value` right after it's assigned by `string value = field.str();` and I get a sensible string. – Ian C. Dec 23 '11 at 22:02
  • Are you sure that `pcre_exec` returns valid offsets for adding to `str.begin()`? Also, why take a parameter `offset` if you don't use it? – CB Bailey Dec 23 '11 at 22:35
  • @CharlesBailey I'm not and I think you're on to the problem. If I drop that pcre_exec() line and simply set `groups = -1` (which is what should happen for every field in every record in my example code, it works. – Ian C. Dec 23 '11 at 22:40

1 Answers1

2

Without a complete test case it's almost impossible to say what the exact problem is but given the behaviour, it is highly likely that your code has some form of undefined behavior and works in debug and with the extra cout statement through blind luck.

You should analyse the code and fix the underlying issue otherwise it's highly likely that a related bug will recur at the least convenient moment.

If you want help analysing the actual problem then you need to post a complete compilable example. At the moment we don't know anything about Field4memo, trim, substituteDriveLetters or substituteGridMount for starters.

Edit You may want to insert more checks for the string operations that you perform.

// need to check that offsets[2] >= 0 and < str.size()
char driveLetter = toupper(str[offsets[2]]);

// need to check that offsets[0] >= 0 and <= str.size()
string::iterator start = str.begin() + offsets[0];

// need to check that offsets[1] >= 0 and <= str.size()
string::iterator end = str.begin() + offsets[1];
CB Bailey
  • 755,051
  • 104
  • 632
  • 656
  • It's nearly impossible for me share test program I'm afraid. The input is protected IP and I have neither the means to create something for others to test it with, or the rights to share it. Sucks. I know. – Ian C. Dec 23 '11 at 21:33
  • 1
    @IanC.: In that case your question is doomed to be "incomplete". – CB Bailey Dec 23 '11 at 21:35