
With support for macOS, Windows (MSVC), and Linux, how do I do the following?

char *s;
func(&s, "foo");
if (<condition>) func(&s, "bar%s", "can");
/* want "foobarcan", and I don't know `strlen(s)` ahead of time */

I've tried asprintf (I was able to find an MSVC implementation), but it didn't fit this kind of workflow well. fopencookie and funopen seem convenient, but they are unavailable on MSVC.

Maybe there's a clean way to build a NUL-terminated char* with realloc in C?

A T
  • You cannot pass `char *s` and `realloc()` within a function without returning and overwriting the original address held by the pointer. Either pass the address of `s`, e.g. `char **s`, or change the function return type to `char *` and return a realloc'ed `s`. Otherwise you must know the length of `s` to avoid writing beyond the bounds of `s`. You cannot call `realloc()` on `s` for the initial allocation unless `s` is initialized `NULL`. – David C. Rankin Mar 23 '22 at 05:05
  • Use `vsnprintf` with `n=0` to determine how long the output string is going to be. Then `malloc` at least `length+1` bytes of memory. Call `vsnprintf` again with the correct value of `n` to create the string. Finally, `free` the previous string, if any. BTW, you need to initialize `s`, e.g. `char *s = NULL;` – user3386109 Mar 23 '22 at 05:19
  • `func(&s, "foo")` and `func(&s, "bar%s", "can")` -- C does not allow function-overloading like C++, so presumably you mean `func1(...)` and `func2(...)`? – David C. Rankin Mar 23 '22 at 05:58
  • David: yes. Was just using this setup to match `asprintf`, it doesn't have to be these types (and can initialise to `NULL` or 0). user3386109 ok so that's the cleanest way? – A T Mar 23 '22 at 14:02
  • @AT That's the cleanest way to figure out how much memory you need. Note that I would use `malloc` to allocate the buffer, and not `realloc`. That allows you to do things like append the existing string to itself, e.g. `char *s = NULL; func(&s, "foo"); func(&s, "%s", s);` would create the string "foofoo". – user3386109 Mar 23 '22 at 18:43

1 Answer


As pointed out in the comments, (v)snprintf always returns the number of bytes that would have been written (excluding the terminating null byte), even if the output was truncated. As a consequence, calling the function with a size argument of 0 returns the length of the to-be-formatted string without writing anything.
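The measuring step on its own can be sketched as below. `formatted_length` is a hypothetical helper name, not part of any library, and the sketch assumes a C99-conforming `vsnprintf` (on MSVC that should mean Visual Studio 2015 or later; the older `_vsnprintf` reports truncation differently).

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: returns the number of bytes the formatted output
   would occupy, excluding the terminating null byte, or a negative value
   if formatting fails. Nothing is written because the size is 0. */
int formatted_length(const char *fmt, ...) {
    va_list args;

    va_start(args, fmt);
    int length = vsnprintf(NULL, 0, fmt, args);
    va_end(args);

    return length;
}
```

For instance, `formatted_length("bar%s%d", "can", 7)` should give 7, so a buffer for "barcan7" needs 8 bytes.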

Using this value, plus the string length of our existing string (if applicable), plus one, we (re)allocate the appropriate amount of memory.

To concatenate, simply print the formatted string at the correct offset.

An example, sans error checking.

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *dstr(char **unto, const char *fmt, ...) {
    va_list args;
    /* length of the existing string, if there is one */
    size_t base_length = unto && *unto ? strlen(*unto) : 0;

    /* first pass: measure the formatted output (check length for failure) */
    va_start(args, fmt);
    int length = vsnprintf(NULL, 0, fmt, args);
    va_end(args);

    /* grow the buffer, +1 for the NUL (check result for failure) */
    char *result = realloc(unto ? *unto : NULL, base_length + length + 1);

    /* second pass: format at the end of the existing string (check for failure) */
    va_start(args, fmt);
    vsnprintf(result + base_length, (size_t) length + 1, fmt, args);
    va_end(args);

    if (unto)
        *unto = result;

    return result;
}

int main(void) {
    char *s = dstr(NULL, "foo");

    dstr(&s, "bar%s%d", "can", 7);

    printf("[[%s]]\n", s);

    free(s);
}

stdout:

[[foobarcan7]]
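Since the example above leaves out error checking, here is a rough sketch of what a checked variant might look like. The name `dstr_checked` and the convention of returning NULL while leaving `*unto` untouched on failure are assumptions made for illustration. It also allocates a fresh buffer with `malloc` instead of calling `realloc`, so the old string stays valid while the arguments are read, which should allow the existing string to appear among them.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical checked variant of dstr. Returns the new string, or NULL
   on failure, in which case *unto is left untouched. */
char *dstr_checked(char **unto, const char *fmt, ...) {
    va_list args;
    char *old = unto ? *unto : NULL;
    size_t base_length = old ? strlen(old) : 0;

    /* First pass: measure the formatted output. */
    va_start(args, fmt);
    int length = vsnprintf(NULL, 0, fmt, args);
    va_end(args);
    if (length < 0)
        return NULL;

    /* Fresh buffer: old string + formatted text + terminating NUL. */
    char *result = malloc(base_length + (size_t) length + 1);
    if (!result)
        return NULL;

    if (old)
        memcpy(result, old, base_length);

    /* Second pass: format right after the copied prefix. */
    va_start(args, fmt);
    int written = vsnprintf(result + base_length, (size_t) length + 1, fmt, args);
    va_end(args);
    if (written != length) {
        free(result);
        return NULL;
    }

    /* Only now is it safe to release the old string. */
    free(old);
    if (unto)
        *unto = result;

    return result;
}
```

With `char *s = NULL;`, the calls `dstr_checked(&s, "foo")`, `dstr_checked(&s, "bar%s%d", "can", 7)` and `dstr_checked(&s, "%s", s)` should leave `s` holding the previous contents twice over. The cost is an extra copy of the existing string on every call.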

The caveat here is that you cannot write:

char *s;
dstr(&s, "foo");

s must be initialized as NULL, or the function must be used directly as an initializer, with the first argument set to NULL.

That, and the second argument is always treated as a format string. If the first string contains unsanitized data, use some other means of creating it rather than passing it as the format.

Example exploit:

/* exploit */
char buf[128];
fgets(buf, sizeof buf, stdin);

char *str = dstr(NULL, buf);
puts(str);
free(str);

stdin:

%d%s%s%s%s%d%p%dpapdpasd%d%.2f%p%d

Result: Undefined Behavior
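The usual way around this is to pass untrusted text as an argument to a fixed `"%s"` format, so that it is copied rather than interpreted. A minimal sketch with plain `snprintf` follows; `copy_untrusted` is a hypothetical helper name. With the function from this answer, the equivalent call would be `dstr(NULL, "%s", buf)`.

```c
#include <stdio.h>

/* Hypothetical helper: copies `untrusted` into `out` verbatim and returns
   the length it needed, as snprintf does.
   Unsafe alternative: snprintf(out, size, untrusted); would interpret any
   conversion specifiers contained in the data. */
int copy_untrusted(char *out, size_t size, const char *untrusted) {
    return snprintf(out, size, "%s", untrusted);
}
```

Here the percent signs in the input are ordinary characters, so a line such as the one shown above ends up in the output unchanged.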

Oka
  • Also worth noting that `s` cannot be `NULL` if the string passed in `fmt` contains a conversion specifier for it (e.g. `"%s%s"`). That gets a bit deeper as you need a way to parse `fmt` to determine the number of conversion specifiers present and then match that to the non-NULL arguments given. (corner-case) – David C. Rankin Mar 23 '22 at 06:20
  • @DavidC.Rankin Sorry, I'm not sure I entirely follow. Could you clarify *"`fmt` contains a conversion specifier for it"*? An example call that invokes this described behaviour would help, I think. – Oka Mar 23 '22 at 06:41
  • Sure, let's say `str` is `NULL` and you want to combine it with `"foo"` in `func (&s, "%s%s", "foo")` to end up with `"foo"`. In that case your `vsprintf` with `fmt` of `"%s%s"` would need to include both `*s` and `"foo"`, but `*s` is `NULL`. The resulting string would be `"(null)foo"`. So the question showing `func(&s, "foo");` and `func(&s, "bar%s", "can")` to end up with `"foobarcan"` presents a challenge. If you limit a single arg like the first case to a default `"%s"` to get `"foo"` you are fine, but then how to combine that with `"%scan"` to get `"foobarcan"` -- that's a pickle. – David C. Rankin Mar 23 '22 at 07:06
  • As I understand the question, the first argument of the function does not correlate to any part of the format string. Or rather, a leading `%s` is somewhat implied, if applicable. The desired operation is akin to `concat(s ? s : calloc(INFINITY, 1), format(fmt, ...args))` (wherein we pretend `format` mimics `sprintf`, but returns a buffer, and `s` always has enough room for the result; ignoring deallocations). `"%s%s"` should always be followed by two varargs; UB otherwise. Maybe I've misunderstood, though. – Oka Mar 23 '22 at 07:43
  • One issue I have spotted is that `dstr(&s, "hello %s", s)` will of course explode, as `s` will have been invalidated before writing. This should be solvable with a separate allocation instead of a reallocation. – Oka Mar 23 '22 at 07:47
  • Thanks, it mostly works, though sometimes I get an `AddressSanitizer: attempting free on address which was not malloc()-ed` on the `realloc` line, even with this simple test: `char *s;dstr(&s, "foo%s", "bar");dstr(&s, "can%s", "haz");ASSERT_EQ(strcmp(s, "foobarcanhaz"), 0);free(s);PASS();` – A T Mar 23 '22 at 22:37
  • Oh it has to be set to `NULL`, gotcha, changed that now. +1 – A T Mar 23 '22 at 22:47