11

Cppcheck has detected a potential problem in a code like this:

float a, b, c;
int count = sscanf(data, "%f,%f,%f", &a, &b, &c);

It says that: "scanf without field width limits can crash with huge data". How is that possible? Is that a known bug in some sscanf implementations? I understand that the numbers may overflow (numerically), but how could the program crash? Is that a false positive in cppcheck?

I have found a similar question: scanf Cppcheck warning, but the answer is not completely satisfying. The answer mentions type safety, but that should not be an issue here.

Community
  • 1
  • 1
Juraj Blaho
  • 13,301
  • 7
  • 50
  • 96
  • Try sscanf_s instead. As normal scanf, sscanf is not overflow safe. – guitarflow Feb 15 '12 at 11:57
  • 2
    @guitarflow: The problem is that I don't see where it may overflow. – Juraj Blaho Feb 15 '12 at 12:03
  • 2
    @guitarflow Or don’t. `sscanf_s` isn’t portable and also not actually safe, despite what the name suggests and Microsoft claims. – Konrad Rudolph Feb 15 '12 at 12:07
  • http://en.wikipedia.org/wiki/Format_string_attack is also important to pay attention to. Buffer overflows alone aren't the only vulnerabilities in scans. If you allow the user to input the format string they can use %x to print arbitrary memory locations and %n to write them. Among other things. – synthesizerpatel Feb 15 '12 at 12:09
  • 2
    @synthesizerpatel: As you can see, format is a string literal here, so that is not a problem. – Juraj Blaho Feb 15 '12 at 12:12
  • Yes, in this particular case perhaps. But, thats the problem with static code analysis - you get a lot of false positives. I try to avoid scanf just because whoever inherits my code might not know what the boundaries of safety are. But, thats just me. – synthesizerpatel Feb 15 '12 at 12:20
  • @KonradRudolph Really? I knew that it isn't portable but I didn't know about the potential danger. What makes it unsafe? – guitarflow Feb 15 '12 at 12:38
  • @JurajBlaho Check this out http://crasseux.com/books/ctutorial/sscanf.html#sscanf – guitarflow Feb 15 '12 at 12:39
  • @guitarflow Well it presupposes that you already know the buffer size. but this is usually the issue in the first place. `sscanf_s` doesn’t actually check (and cannot check) whether the buffer size is correct. So it protects only insofar as it makes the buffer size explicit. A far superior method is preventing buffer overflows in the first place, and C++ makes this trivial. (Also, at least one of the “safe” commands – but I don’t remember which – had a buffer overflow bug. Oh the irony.) – Konrad Rudolph Feb 15 '12 at 13:52

3 Answers3

7

I am a Cppcheck developer.

Yes this is a weird crash. With "huge data" it means millions of digits.

If you use the --verbose flag then cppcheck will actually write a little example code that usually crashes on linux computers.

Here is an example code that crashes with a segmentation fault on my Ubuntu 11.10 computer:

#include <stdio.h>

#define HUGE_SIZE 100000000

int main()
{
    int i;
    char *data = new char[HUGE_SIZE];
    for (int i = 0; i < HUGE_SIZE; ++i)
        data[i] = '1';
    data[HUGE_SIZE-1] = 0;
    sscanf(data, "%i", &i);
    delete [] data;
    return 0;
}

For your info I don't get a crash when I try this example code on visual studio.

I used g++ version 4.6.1 to compile.

Daniel Marjamäki
  • 2,907
  • 15
  • 16
  • 1
    The question remains. Why does it crash? I don't see any reason when the code to parse the number could be something like: `for each digit in data: result*=10; result+=digit`. How could that crash? Why is it not fixed? – Juraj Blaho Feb 15 '12 at 14:58
  • I primarily wanted to answer "Is that a false positive in Cppcheck?". It is a weird crash so it's easy to think so. I can't answer why technically it crashes. It has been a known and widespread problem for years. I agree that with your code it can't crash so obviously that is not how the data is parsed. – Daniel Marjamäki Feb 15 '12 at 21:44
  • Yes, I understand. Thanks at least for a partial answer. I gave you +1. – Juraj Blaho Feb 15 '12 at 22:33
4

The segmentation fault seems to be a bug in glibc.

I've just tested this with a similar program, which crashes in ubuntu 10.04, but works in ubuntu 12.04.

As Daniel Marjamäki said, his program crashes in 11.10, I believe the bug is fixed in between.

1

OK, consider this code:

int main(int argc, char *argv[]) {
    const char* data = "9999999999999999999999999.9999999999999999999999//i put alot more 9's there, this just to get the point through
    float a;
    int count = sscanf(data, "%f", &a);
    printf("%f",a);
}

the output of this program is "inf" - no crash. And I put a huge amounts of 9's there. So I suspect Cppcheck is just plain wrong about this.

WeaselFox
  • 7,220
  • 8
  • 44
  • 75