0

I am trying to write a native host client for a WebExtension browser addon in C++.

I looked for examples and unfortunately only found a couple ones.

There's this simple good looking one I'm following but I don't understand the part where a string is constructed as follows:

json get_message() {
    char raw_length[4];
    fread(raw_length, 4, sizeof(char), stdin);
    uint32_t message_length = *reinterpret_cast<uint32_t*>(raw_length);
    if(!message_length)
        exit(EXIT_SUCCESS);

    char message[message_length];
    fread(message, message_length, sizeof(char), stdin);
    string m( message, message + sizeof message / sizeof message[0] );
    return json::parse(m);
}

I looked at multiple documentations for C++ std::string constructors but I can't figure what is going on here exactly.

What does string m( message, message + sizeof message / sizeof message[0] ); do?

And why not just do string m(message)?

There is also the char message[message_length]; line which confused me because message_length is not a constant but apparently this is a GCC-specific feature.

Pouria P
  • 565
  • 1
  • 8
  • 21
  • 4
    `char message[message_length];` isn't standard C++. It uses VLA. – Jason May 11 '22 at 17:01
  • 1
    It's the same as `char *message = alloca(message_length)`, sort of. – Goswin von Brederlow May 11 '22 at 17:08
  • @GoswinvonBrederlow There's no alloca in C++ either though. – eerorika May 11 '22 at 17:09
  • I wasn't aware that non-standard features were banned in C++ questions. – user253751 May 11 '22 at 17:14
  • 2
    The code potentially leaks stack contents from previous functions because the message isn't initialized and the return value of fread isn't checked. An attacker can potentially trick you into returning your private ssl key in such a way that you would send it back to him in a response to the message. – Goswin von Brederlow May 11 '22 at 17:15
  • @GoswinvonBrederlow I plan to rewrite this to use the `new` keyword to declare `message`. That will solve this right? – Pouria P May 11 '22 at 17:17
  • raw_length lacks the necessary alignment to be used as `uint32_t*` and dereferencing it is UB. So I wouldn't call this code *good looking*. It's rather smelly. – Goswin von Brederlow May 11 '22 at 17:18
  • 2
    Unfortunately, this code is a pretty bad example. If you know the message size, you can just construct your string as "string m(size, '\0')" and use that as buffer. It unnecessarily fills the string with zeros but it is a negligible cost compared to fread. "sizeof message / sizeof message[0]" is the old C way of determining the size of an array. In C++ you use std::size for that. – Luppy May 11 '22 at 17:19
  • @GoswinvonBrederlow Thanks for pointing that out. I myself use good old bitwise operations for things like this. I thought maybe this is a more clever solution. – Pouria P May 11 '22 at 17:20
  • 1
    @PouriaP No, new will not initialize either. But that isn't a problem if you check the return value from fread you will know if the message was completely filled by the fread or not. PS: think about using std::vector message(message_length) and read into message.data() – Goswin von Brederlow May 11 '22 at 17:22
  • 1
    @PouriaP I would have gone with `uint32_t raw_length` and cast that to `void*` for the fread. And don't forget to check the return value there too. – Goswin von Brederlow May 11 '22 at 17:23
  • In general, place `new` close to the back of your tool box and [reach for it sparingly](https://stackoverflow.com/questions/6500313/why-should-c-programmers-minimize-use-of-new). – user4581301 May 11 '22 at 17:30
  • Very bad code. It has multiple issues. The only points are about json api: `` is the header you need. Everything is probably in `lohmann` namespace. class `json` has a `static json::parse(std::string const&)` member as well as a none-static `std::string json::dump(void)` member. Rest is garbage. – Red.Wave May 11 '22 at 18:00
  • Thanks @GoswinvonBrederlow and everyone else. I realize the arguments to the fread are wrong too and it doesn't return the bytes read. This code seems to be bad all over. I will look for better examples. – Pouria P May 12 '22 at 07:29

1 Answers1

2

What does string m( message, message + sizeof message / sizeof message[0] ); do?

message decays to pointer to first element, sizeof message / sizeof message[0] is the number of elements in the array, and message + sizeof message / sizeof message[0] is a pointer to the past the end of the array. Note that this is unnecessarily convoluted way to get the number of elements. I would recommend using message + message_length instead.

The constructor copies all characters from the range between these iterators (pointers). The iterators span the entire array.

What I don't understand is I can't seem to find a std::string constructor that accepts (char*, char*)

The constructor looks like this (simplified):

template<class InputIt>
string(InputIt first, InputIt last);

The template parameter is deduced to be char* from the arguments.

And why not just do string m(message)?

That constructor requires that message is terminated by null character. If the argument isn't terminated by null character, then the behaviour of the program would be undefined. That would be bad.

Also, if the array does contain null terminator characters, then that constructor would not copy those into the string, while the iterator range constructor does. That would be bad if the intention is to copy the null terminators.

There is also the char message[message_length]; line which confused me because message_length is not a constant

This isn't allowed in C++. The program is ill-formed.


Since the example is rather poor quality, here is a version that fixes major issues:

json get_message(std::istream& in)
{
    std::uint32_t message_length;
    // NOTE input is assumed to be in native byte order
    in.read(reinterpret_cast<char*>(&message_length), sizeof message_length);
    if (!in)
        throw std::runtime_error("Invalid message length");
    if (!message_length)
        throw std::runtime_error("No input");

    std::string m('\0', message_length);
    in.read(m.data(), message_length);
    if (!in)
        throw std::runtime_error("Invalid message");
    return json::parse(m);
}

You can adjust if you want to use another error handling instead of exceptions. The demonstrated error handling has room for improvement (unique exception classes etc.).

Most significant deviation from the original example is that the function can read from any input stream; not only standard input. I've chosen to do so not only because it's more general, but also because there's actually no standard+portable way to read binary from standard input. See C read binary stdin

eerorika
  • 232,697
  • 12
  • 197
  • 326
  • 1
    > It copes message + sizeof message / sizeof message[0] characters from the array message into the string. I believe it actually refers to the range constructor, not number of characters. – The Dreams Wind May 11 '22 at 17:02
  • @TheDreamsWind Oh yea, I misread. Fixed. – eerorika May 11 '22 at 17:03
  • Saying VLAs aren't allowed seems nebulous. They are allowed, as code compiles when they're used. They are not Standard C++, but compilers support them as extensions. They are also not strictly forbidden AFAIK. They are definitely a bad practice that should be fixed. Just nitpicking a bit. – sweenish May 11 '22 at 17:05
  • It also handles the case where the `message` buffer contains embedded 0s which are part of the message and need to be copied into the string. – Richard Critten May 11 '22 at 17:05
  • @sweenish `They are allowed, as code compiles ` Code compiling doesn't mean that it is allowed in C++. It is strictly forbidden in C++. It may be allowed by a language extension. – eerorika May 11 '22 at 17:06
  • @RichardCritten Good point. I added that to the answer. – eerorika May 11 '22 at 17:07
  • I would argue that's exactly what it means. If it gets past syntax checking, it's allowed. I think my definition of the term is just looser than yours. In any case, I agree that it shouldn't be used. – sweenish May 11 '22 at 17:08
  • Thanks. I realize that `message + sizeof message / sizeof message[0]` is a pointer to past the end of the array. What I don't understand is I can't seem to find a `std::string` constructor that accepts `(char*, char*)`. So how does this even work? – Pouria P May 11 '22 at 17:13
  • A better way would be to fix the fread (its arguments are wrong) and use the return value. – user18533375 May 11 '22 at 18:10
  • @user18533375 A better way to do what? There's much room for improvement in the function in general. – eerorika May 11 '22 at 18:11