
I'm using NanoPB to send encoded data (an array of unsigned char) from a server to a client. I map each byte to a single char, concatenate them, and send the result as one string over the network. On the client side, I have a serial interface that can read the server's response with getc or gets. The problem is that the buffer may contain null bytes, in which case gets would fail. For example, suppose the buffer contains something like this:

unsigned char buffer[] = {72, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 0, 24, 1, 32, 1, 40, 0};

For simplicity, I wrote the buffer to a file and am trying to read it back and reconstruct it (with the help of this):

#include <stdio.h>

void strInput(FILE *fp, char str[], int nchars) {
    int i = 0;
    int ch;
    while ((ch = fgetc(fp)) != '\n' && ch != EOF) {
        if (i < nchars - 1) {  // leave room for the terminating '\0'
            str[i++] = ch;
        }
    }
    str[i] = '\0';
}

void readChars(FILE *fp)
{
    char c = fgetc(fp);
    while (c != EOF)
    {
        printf("%c", c);
        c = fgetc(fp);
    }
}


int main() {
    FILE *fp;
    const char* filepath = "mybuffer.txt";
    char c;
    char buffer[100];

    fp = fopen(filepath, "r+");    
    strInput(fp, buffer, sizeof(buffer));
    printf("Reading with strInput (WRONG): %s\r\n", buffer);
    fclose(fp);

    fp = fopen(filepath, "r+");
    printf("Reading char by char: ");
    readChars(fp);
    printf("\r\n");
    fclose(fp);

    getchar();
    return 0;
}

And here is the output:

Reading with strInput (WRONG): Hello world
Reading char by char: Hello world  (

How can I reconstruct the buffer from that file? Why does readChars print the whole buffer but strInput does not?

Masoud Rahimi
  • What is your question? Note that 1) [`gets()` should never be used under any circumstances](https://stackoverflow.com/questions/1694036/why-is-the-gets-function-so-dangerous-that-it-should-not-be-used), and 2) both `gets()` and `fgets()` read input (including null bytes) until a newline is found. – ad absurdum Dec 12 '18 at 05:13
  • Not sure what to tell you. Many standard C functions assume `'\0'` is the terminator. If your data contains this character, you'll need to avoid those functions, writing your own functions if an appropriate one is not available. – Jonathan Wood Dec 12 '18 at 05:13
  • If the buffer contains NUL terminators, it can't be printed with `printf`. And you can't read it with `gets` either. Of course, you should [never use `gets`](https://stackoverflow.com/questions/1694036) anyways. You should read a byte at a time with `fgetc`, or read the whole message with `fread`. – user3386109 Dec 12 '18 at 05:14
  • @JonathanWood Are you saying there is no way to read a file that contains null chars? – Masoud Rahimi Dec 12 '18 at 05:18
  • @user3386109 Actually I'm reading it through a serial interface that has only `getc` and `gets` functions. Is there any way I can read the chars one by one and recreate the buffer? – Masoud Rahimi Dec 12 '18 at 05:19
  • Of course there is a way. You didn't have a direct question in your post originally. You'll need to read it as a binary file. – Jonathan Wood Dec 12 '18 at 05:19
  • @MasoudR. Yes, you can use `getc` to read one byte at a time. But you need to know how many bytes to read. This can be accomplished by sending the length first. – user3386109 Dec 12 '18 at 05:22
  • @JonathanWood I can't because it is not actually in a file, it is a serial interface that return server respond (as string) char by char. Can you look at [this](https://os.mbed.com/docs/latest/apis/serial.html). – Masoud Rahimi Dec 12 '18 at 05:22
  • @MasoudR.: Well, after you edited your post to have an actual question, your question is how to reconstruct the buffer from a file. So perhaps you should work on how you write your questions. – Jonathan Wood Dec 12 '18 at 05:24
  • @user3386109 -- both `fgets()` and `gets()` will read null bytes; they both read input until a `\n` is encountered. – ad absurdum Dec 12 '18 at 05:25
  • @JonathanWood -- no need for binary files; null bytes are read by `gets()` and `fgets()`. – ad absurdum Dec 12 '18 at 05:27
  • "return server respond (as string) char by char" This is NOT correct. If it is not NUL-terminated, it is not a string. You do not have strings, therefore you should not treat them as strings. – Gerhardh Dec 12 '18 at 07:18
  • @Gerhardh -- note that OP appears to have strings. A `char` array with embedded null bytes does contain a string, yet maybe not the string expected. OP sample is a null-terminated array that also contains an embedded null byte. There is a string at the start of the buffer that ends with the first `\0`, and a second string starting with the character following the first `\0` that ends with the final `\0`. Treating the whole buffer as a string may lead to surprising results, as in fact happened here. – ad absurdum Dec 12 '18 at 07:54
  • @DavidBowling Agreed, you can view it that way. But the OP talks about strings with embedded NUL-characters. Not about having multiple strings. I would assume the embedded NUL bytes are a result of the encoding of the data. It seems as if binary data is simply sent as text. Nevertheless, two strings are not one string. So don't treat them as a string. The correct way would be to handle it as buffers of binary data. – Gerhardh Dec 12 '18 at 08:07

1 Answer


"Why does readChars print the whole buffer but strInput does not?"

The readChars() function prints each character as soon as it is read, one at a time, in this loop:

while (c != EOF)
    {
        printf("%c", c);
        c = fgetc(fp);
    }

The strInput() function, by contrast, prints nothing itself. It only fills buffer[], and main() then prints the contents of buffer[] as a string using the %s conversion specifier:

strInput(fp, buffer, sizeof(buffer));
printf("Reading with strInput (WRONG): %s\r\n", buffer);

This time printing stops at the first embedded \0 character, because %s treats its argument as a null-terminated string. The bytes after the \0 are still in buffer[]; they are just never printed.

Note that c in the readChars() function should be an int, not a char. The fgetc() function returns an int value, and EOF may not be representable in a char.
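A minimal sketch of readChars() with that fix applied is shown below. The return value (a count of the bytes read) is an addition for illustration and is not part of the original function. Storing the result of fgetc() in an int matters for binary data in particular: where char is signed, a byte with value 255 typically compares equal to EOF once it is stored in a char, so the original loop could stop early.

```c
#include <stdio.h>

/* Print every byte from fp until end of file.
   Returns the number of bytes read (added here for illustration).
   ch is an int so that EOF can be told apart from every valid byte value. */
size_t readChars(FILE *fp)
{
    size_t count = 0;
    int ch;

    while ((ch = fgetc(fp)) != EOF) {
        putchar(ch);   /* null bytes are written too, they just may not be visible */
        count++;
    }
    return count;
}
```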

If you want to see the embedded null bytes, print the characters from buffer[] one at a time:

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("mybuffer.txt", "r");
    if (fp == NULL) {             // report an open failure instead of crashing
        perror("mybuffer.txt");
        return 1;
    }

    char buffer[100] = { '\0' };  // zero to avoid UB when printing all chars
    fgets(buffer, sizeof buffer, fp);
//  could just as well use:
//  strInput(fp, buffer, sizeof(buffer));
    fclose(fp);

    for (size_t i = 0; i < sizeof buffer; i++) {
        if (buffer[i] == '\0') {
            putchar('*');        // some character not expected in input
        }
        else {
            putchar(buffer[i]);
        }
    }
    putchar('\n');

    return 0;
}
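To reconstruct the buffer rather than just display it, the reader has to know how many bytes to expect, since neither '\0' nor '\n' can serve as an end marker for binary data. One option, suggested in the comments, is to send the length first. The sketch below assumes a hypothetical framing of one length byte followed by the payload, which limits a message to 255 bytes; the names writeFrame and readFrame are made up for this example and do not come from NanoPB or any serial library.

```c
#include <stdio.h>

/* Hypothetical framing: one length byte, then that many payload bytes.
   Returns 0 on success, -1 if the payload is too long or a write fails. */
int writeFrame(FILE *fp, const unsigned char *data, size_t len)
{
    if (len > 255) {
        return -1;
    }
    if (fputc((int)len, fp) == EOF) {
        return -1;
    }
    return fwrite(data, 1, len, fp) == len ? 0 : -1;
}

/* Read one frame into out[], one byte at a time.
   Returns the payload length, or -1 if the input ends early
   or the payload does not fit in cap bytes. */
long readFrame(FILE *fp, unsigned char *out, size_t cap)
{
    int len = fgetc(fp);              /* first byte is the payload length */

    if (len == EOF || (size_t)len > cap) {
        return -1;
    }
    for (int i = 0; i < len; i++) {
        int ch = fgetc(fp);           /* int, so EOF is distinguishable */
        if (ch == EOF) {
            return -1;
        }
        out[i] = (unsigned char)ch;   /* embedded null bytes are copied like any other byte */
    }
    return len;
}
```

The caller keeps the returned length alongside the buffer and uses it for all later processing instead of relying on strlen() or %s. When testing with a file, opening it in binary mode ("rb" and "wb") avoids newline translation on platforms that distinguish text and binary streams. On a serial interface that offers only a getc-style call, the same loop should apply with that call substituted for fgetc(), assuming it returns one byte per call.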
ad absurdum