6

In a C program I have a function that writes a message to a log file

LogResult writeLog(const char* format, ...)

That function passes its arguments on to 'vfprintf()' as a 'format' string and a 'va_list'. It just occurred to me that I don't have any control over what would happen if someone were to pass an unterminated string, e.g.

const char unterminatedString[5] = {'h', 'e', 'l', 'l', 'o'};
writeLog("Log message: %s", unterminatedString);

Is there any way to guard against this?

jokki
  • 251
  • 5
  • 13
  • 1
    Always add \0 yourself. Extra one won't hurt.. – Tony Hopkinson Feb 15 '14 at 22:03
  • 2
    There's no magic: if you want to do this, you have to pass each string's length to `writeLog`, and do the appropriate checking before calling `vfprintf()`, or build a new format string inside `writeLog` limiting the width of each of the printed strings to their sizes and then call `vfprintf` – Filipe Gonçalves Feb 15 '14 at 22:04
  • There are tons of errors like this. Some `%s` arguments might not be valid pointers at all. Argument values may be undefined/uninitialized. The format codes might not match the parameter list types. What makes this particular error different? –  Feb 15 '14 at 22:04
  • 2
    @TonyHopkinson How do you know where to insert it ? – Eric Fortin Feb 15 '14 at 22:06
  • 1
    Insert? did you mean append. Point taken though. – Tony Hopkinson Feb 15 '14 at 22:10
  • 3
    Since you tagged this with C++, I'm going to advice you to use `std::string`. – Zyx 2000 Feb 15 '14 at 22:11
  • 1
    @TonyHopkinson The whole point of the question is that you *precisely* don't know where the string ends. – Filipe Gonçalves Feb 15 '14 at 22:19
  • Thanks for your input everyone! Richard Chambers pointed me in the right direction: 'vsnprintf()', should resolve the issue. – jokki Feb 15 '14 at 22:29
  • Also if you use gcc take advantage of the printf format-checking options by adding the attribute `__attribute__ ((format (printf, 1, 2)))` to the declaration for writeLog. – Brandin Feb 15 '14 at 22:35

4 Answers4

2

There are many other ways (on top of the one that you've mentioned) for passing an illegal string.

A few examples:

char* unterminatedString;
writeLog("Log message: %s", unterminatedString);

char* unterminatedString = (char*)0xA5A5A5A5;
writeLog("Log message: %s", unterminatedString);

char unterminatedString[] = "abc";
unterminatedString[3] = 'd';
writeLog("Log message: %s", unterminatedString);

int x = -1;
char* unterminatedString = (char*)&x;
writeLog("Log message: %s", unterminatedString);

A legal string must start at a valid memory address, and end (with a 0 character) at a valid memory address. There is no definite way to assert that, except for iterating the string itself, until reaching a 0 character or performing a memory access violation (which is exactly what the vfprintf function does).

barak manos
  • 29,648
  • 10
  • 62
  • 114
  • 1
    I'd be interested in at least a rough sketch of a reduction to the halting problem, because I don't buy this at all. There's also the fact that plenty of languages have plenty of means of enforcing memory safety, though the relevance of this depends on your viewpoint. –  Feb 15 '14 at 22:13
  • I added a few examples of an illegal input string. I'd be interested in at least a rough sketch of enforcing memory safety in each one of these cases. – barak manos Feb 15 '14 at 22:33
  • Oh, I absolutely think it's not possible in the framework of standard C. But it's perfectly possible to make a memory safe language, by prohibiting operations that can create a string that's not properly terminated. The resulting language can be not only Turing complete but also very useful and practical (case in point: virtually every language with automatic memory management). You have not presented *any* evidence that solving this problem implies solving the halting problem (that, and nothing less, is a *reduction*). –  Feb 15 '14 at 22:38
  • In a native language such as C (**which is the tagged language in this question**), where the user has unrestricted access to memory, this is not feasible. If the target is a VM and not the CPU, then this is obviously feasible. – barak manos Feb 15 '14 at 22:41
  • Have a look at Rust then (and ignore the existence of `unsafe` for a second). Also, we're not arguing about whether it's possible in C or in system programming languages. I never believed it can be prevented in C, and never said so either. You claimed it's equivalent to the halting problem. **That** is what I'm challenging. –  Feb 15 '14 at 22:43
  • So to complete my answer - if you have a way for asserting that a given string ends with a 0 character at a valid memory address without iterating that string, then surely you can apply this magic and decide whether or not a machine halts on a given input program. – barak manos Feb 15 '14 at 22:44
  • So your whole argument is "this seems impossible, so surely any magic solution to it would also magically resolve this other problem which has an impossibility proof"? That's *extremely* weak, and not a reduction in any sense of the word. Color me disappointed. –  Feb 15 '14 at 22:46
  • No, it is definitely not **my whole argument**, and most certainly not a mathematical reduction by any means!!! It could be a nice exercise though, but I don't think that I'm gonna set my mind into it at this very moment :) – barak manos Feb 15 '14 at 22:48
  • Well, what *is* your argument then? For the claim that preventing invalid pointers is equivalent to solving the halting problem, not for the claim that doing so is possible in standard C. –  Feb 15 '14 at 22:51
  • My argument (not a **full** description of the reduction itself), is that it is impossible to tell whether or not a string ends with a 0 character at a valid memory address, without iterating the string and (possibly) attempting to access an illegal memory address. – barak manos Feb 15 '14 at 22:55
  • This thing about the halting problem is theoretically interesting, but wouldn't this situation require a TM with a finite tape (due to the C memory model). Then in the case of a C-style string, the worst possible case is that you look at each spot on the tape and fail to find '\0'. – Brandin Feb 15 '14 at 22:57
  • I think I might remove the section about the halting problem, and leave only the technical reference that I made afterwards. – barak manos Feb 15 '14 at 22:59
  • You should remove it, because I have seen nothing that supports it. Your argument suggests that both problems are undecidable, but doesn't even slightly hint at a relationship to the halting problem (or any other undecidable problem for that matter). Not all impossibilities are equivalent. As an example, the halting problem of a TM with an oracle for the usual halting problem (henceforth TM+oracle) is not decidable by TM+oracle, but not reducible to the usual halting problem (it wouldn't be undecidable otherwise, since TM+oracle *can* solve the usual halting problem. –  Feb 15 '14 at 23:03
2

const char* does not know the length of the array, it is just an address. So that is not simply possible.

If you want to stay in pure C, you could pass the length int len of the string as an extra argument and check if format[len]==0. I do not really think this is helpful though, it is even worse in my opinion.

Since you flagged this question with C++, you could (and imo should) use std::string:

LogResult writeLog(const std::string& format, ...)

An std::string is always correctly terminated, you can access its underlying C-style string with std::string::c_str().

Note that std::string can store \0-Characters, so functions using an C-style string would stop at the first \0, not necessarily at the end of the std::string.

Baum mit Augen
  • 49,044
  • 25
  • 144
  • 182
2

I assume your log function has its own internal buffer and you know the size of that.

You can then use the vsprintf() functions that specify a buffer size. See Possible buffer overflow vulnerability.

Community
  • 1
  • 1
Richard Chambers
  • 16,643
  • 4
  • 81
  • 106
  • You are very right, 'writeLog()' does have an internal buffer of known size and I believe that 'vsnprintf()' is the answer to my question. I'd somehow missed that one. – jokki Feb 15 '14 at 22:22
  • By the way I think if you pass an unterminated array you are still not guarded against UB. Consider if the unterminated array given to your writeLog function wanders off into the weeds before your buffer runs out of room. I think that is still UB, although perhaps this sort of UB is not nerely "as bad" as overrunning your buffer. – Brandin Feb 15 '14 at 22:39
  • @Brandin, my understanding of most buffer overflow is to prevent the buffer being written from being written past the end of the buffer. One possibility is that the format string is not terminated and happens to be in an area that happens to have proper format specifiers so that garbage arguments are read resulting in garbage in the log however the output buffer for the log will not overflow. C calling convention will return correctly. See http://stackoverflow.com/questions/13950642/why-does-a-function-with-no-parameters-compared-to-the-actual-function-definiti – Richard Chambers Feb 16 '14 at 01:49
1

No, there is no way to portably do it.

What you could do is look for this types of bugs using a tool like valgrind.

NPE
  • 486,780
  • 108
  • 951
  • 1,012