29

OWASP says:

"C library functions such as strcpy (), strcat (), sprintf () and vsprintf () operate on null terminated strings and perform no bounds checking."

sprintf writes formatted data to string int sprintf ( char * str, const char * format, ... );

Example:

sprintf(str, "%s", message); // assume declaration and 
                             // initialization of variables

If I understand OWASP's comment, then the dangers of using sprintf are that

1) if message's length > str's length, there's a buffer overflow

and

2) if message does not null-terminate with \0, then message could get copied into str beyond the memory address of message, causing a buffer overflow

Please confirm/deny. Thanks

Kevin Meredith
  • 41,036
  • 63
  • 209
  • 384
  • 3
    Also, programmer error such as `sprintf(str, message)` or similar is a real risk. – You Sep 07 '10 at 21:50

8 Answers8

27

You're correct on both problems, though they're really both the same problem (which is accessing data beyond the boundaries of an array).

A solution to your first problem is to instead use std::snprintf, which accepts a buffer size as an argument.

A solution to your second problem is to give a maximum length argument to snprintf. For example:

char buffer[128];

std::snprintf(buffer, sizeof(buffer), "This is a %.4s\n", "testGARBAGE DATA");

// std::strcmp(buffer, "This is a test\n") == 0

If you want to store the entire string (e.g. in the case sizeof(buffer) is too small), run snprintf twice:

int length = std::snprintf(nullptr, 0, "This is a %.4s\n", "testGARBAGE DATA");

++length;           // +1 for null terminator
char *buffer = new char[length];

std::snprintf(buffer, length, "This is a %.4s\n", "testGARBAGE DATA");

(You can probably fit this into a function using va or variadic templates.)

Pharap
  • 3,826
  • 5
  • 37
  • 51
strager
  • 88,763
  • 26
  • 134
  • 176
  • 5
    How are they "both the same problem" when `snprintf` only solves the first one? – Rob Kennedy Sep 07 '10 at 21:35
  • @Rob Kennedy, They're the same problem in different places. Sorry that wasn't clear. You're right; `snprintf` blindly replacing `sprintf` only fixes the first one. – strager Sep 07 '10 at 21:40
  • snprintf will also truncate the arguments passed, solving number 2 (if message is not NULL terminated, only the first N characters will be written). – dash-tom-bang Sep 07 '10 at 21:45
  • @dash-tom-bang, While this may be true, it's still possibly treading into the land of undefined behaviour. – strager Sep 07 '10 at 21:51
  • But @Dash, if N is huge, and `message` is supposed to be short but lacks its null terminator, then N's huge value doesn't protect you from the access violation you'll get from reading too far, or from revealing the contents of memory that were supposed to stay secret. – Rob Kennedy Sep 07 '10 at 21:57
  • Perhaps the solution then is to keep N small if message is supposed to be short? :) – dash-tom-bang Sep 07 '10 at 22:00
  • This use of `snprintf` is a popular misconception. If the string is too long, it will simply get truncated. Who said that the truncated string is acceptable in this case? Generally, it is not. So this is just a way to replace one problem with another. – AnT stands with Russia Sep 07 '10 at 22:06
  • The solution presented by `snprintf` is based on the fact that one can call `snprintf` with null arguments first, obtain the exact length of buffer necessary to store the result, make sure that the sufficiently-sized buffer is allocated and then use that buffer to store the result. – AnT stands with Russia Sep 07 '10 at 22:08
  • @AndreyT, You are correct in that it may replace one problem with another. However, the initial problem was a *programming error* leading to *undefined behaviour*, which is **always** bad (unless you're cynical). On the other hand, truncating the string won't cause an access violation. (I didn't see your second comment, but) I added an example of the technique you describe in my answer. – strager Sep 07 '10 at 23:03
11

Both of your assertions are correct.

There's an additional problem not mentioned. There is no type checking on the parameters. If you mismatch the format string and the parameters, undefined and undesirable behavior could result. For example:

char buf[1024] = {0};
float f = 42.0f;
sprintf(buf, "%s", f);  // `f` isn't a string.  the sun may explode here

This can be particularly nasty to debug.

All of the above lead many C++ developers to the conclusion that you should never use sprintf and its brethren. Indeed, there are facilities you can use to avoid all of the above problems. One, streams, is built right in to the language:

#include <sstream>
#include <string>

// ...

float f = 42.0f;

stringstream ss;
ss << f;
string s = ss.str();

...and another popular choice for those who, like me, still prefer to use sprintf comes from the boost Format libraries:

#include <string>
#include <boost\format.hpp>

// ...

float f = 42.0f;
string s = (boost::format("%1%") %f).str();

Should you adopt the "never use sprintf" mantra? Decide for yourself. There's usually a best tool for the job and depending on what you're doing, sprintf just might be it.

John Dibling
  • 99,718
  • 31
  • 186
  • 324
  • 1
    GCC (and I believe many others) perform soft typechecking. Also, the path separator in `#include` is `/` no matter what platform. – Potatoswatter Sep 07 '10 at 23:56
  • @Potatoswatter: The interpretation of any character (including `\ ` and `/`) in `#include` is left to the implementation. The Standard doesn't even claim there's a file system, let alone paths with path separators. – MSalters Sep 08 '10 at 09:05
  • @MSalters: Yes, but essentially every implementation supporting pathnames will support `/` for compatibility with existing code and textbooks, and not nearly so much with `\ `. Perhaps more importantly, having `\u` in there may produce a universal-character-name. For that matter, a platform can very reasonably decide to substitute escape sequences instead of or in addition to using as a separator. So, what's the advantage? – Potatoswatter Sep 08 '10 at 15:54
4

Yes, it is mostly a matter of buffer overflows. However, those are quite serious business nowdays, since buffer overflows are the prime attack vector used by system crackers to circumvent software or system security. If you expose something like this to user input, there's a very good chance you are handing the keys to your program (or even your computer itself) to the crackers.

From OWASP's perspective, let's pretend we are writing a web server, and we use sprintf to parse the input that a browser passes us.

Now let's suppose someone malicious out there passes our web browser a string far larger than will fit in the buffer we chose. His extra data will instead overwrite nearby data. If he makes it large enough, some of his data will get copied over the webserver's instructions rather than its data. Now he can get our webserver to execute his code.

T.E.D.
  • 44,016
  • 10
  • 73
  • 134
3

Your 2 numbered conclusions are correct, but incomplete.

There is an additional risk:

char* format = 0;
char buf[128];
sprintf(buf, format, "hello");

Here, format is not NULL-terminated. sprintf() doesn't check that either.

MSalters
  • 173,980
  • 10
  • 155
  • 350
1

Your interpretation seems to be correct. However, your case #2 isn't really a buffer overflow. It's more of a memory access violation. That's just terminology though, it's still a major problem.

bta
  • 43,959
  • 6
  • 69
  • 99
1

The sprintf function, when used with certain format specifiers, poses two types of security risk: (1) writing memory it shouldn't; (2) reading memory it shouldn't. If snprintf is used with a size parameter that matches the buffer, it won't write anything it shouldn't. Depending upon the parameters, it may still read stuff it shouldn't. Depending upon the operating environment and what else a program is doing, the danger from improper reads may or may not be less severe than that from improper writes.

supercat
  • 77,689
  • 9
  • 166
  • 211
1

It is very important to remember that sprintf() adds the ASCII 0 character as string terminator at the end of each string. Therefore, the destination buffer must have at least n+1 bytes (To print the word "HELLO", a 6-byte buffer is required, NOT 5)

In the example below, it may not be obvious, but in the 2-byte destination buffer, the second byte will be overwritten by ASCII 0 character. If only 1 byte was allocated for the buffer, this would cause buffer overrun.

char buf[3] = {'1', '2'};
int n = sprintf(buf, "A");

Also note that the return value of sprintf() does NOT include the null-terminating character. In the example above, 2 bytes were written, but the function returns '1'.

In the example below, the first byte of class member variable 'i' would be partially overwritten by sprintf() (on a 32-bit system).

struct S
{
    char buf[4];
    int i;
};


int main()
{
    struct S s = { };
    s.i = 12345;

    int num = sprintf(s.buf, "ABCD");
    // The value of s.i is NOT 12345 anymore !

    return 0;
}
Community
  • 1
  • 1
Jacob N.
  • 9
  • 2
-1

I pretty much have stated a small example how you could get rid of the buffer size declaration for the sprintf (if you intended to, of course!) and no snprintf envolved ....

Note: This is an APPEND/CONCATENATION example, take a look at here

PYK
  • 3,674
  • 29
  • 17