I've run into a small problem while experimenting with some C code. I tried to use read() to read text from a file and store the result in a char array, but when I print the result, it always differs from the file's contents.

Here is the code:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    void main() {

        int fd = open("file", 2);
        char buf[2];
        printf("Read elements: %ld\n", read(fd, buf, 2));
        printf("%s\n", buf);
        close(fd);

    }

The file "file" was created in the same directory using the following UNIX commands:

cat > file
Hi

So it contains just the word "Hi". When I run the program, I expect it to read 2 bytes from the file ('H' and 'i') and store them in buf[0] and buf[1]. But when I print the result, something appears to go wrong: besides the word "Hi", several weird characters are printed (indicating a memory reading/writing problem, I guess, due to a bad buffer size). I tried increasing the size of buf, and the weird characters change whenever the size changes. The problem disappears once the size reaches 32 bytes.

Can someone explain in detail why this is happening? What I've understood so far is that read() does not append a '\0' to what it reads, and that the third parameter of read() is the maximum number of bytes to read.

Another thing I noticed while experimenting with the code above: suppose one changes the third parameter of read() (the maximum number of bytes to read) to 3 and the size of buf to 512 (overkill, I know, but I really wanted to see what would happen). Now read() actually reads a third character (in my case 'e') and stores it in the buffer, even though this third character does not exist.

I've searched Stack Overflow for a while and found many similar cases, but none of them helped me understand my problem. If there is another thread I missed, I'd be grateful if you could link it.

Finally, sorry for my bad English; it's not my native language.

Tmirror

1 Answer

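You need to make buf 3 bytes long and use the last byte as the null byte (0 or '\0'). read() only copies the bytes from the file; it never appends a terminator. With the terminator in place, printf stops at the end of your text instead of carrying on through whatever memory follows buf until it happens to find a 0.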
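As a minimal sketch of that fix, the helper below reads at most one byte less than the buffer holds and writes the terminator itself. The name `read_as_string` and its parameters are hypothetical, made up for illustration; only open(), read() and close() come from the question.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of the question's code): reads up to
   cap - 1 bytes from path into buf, then appends the '\0' that read()
   never writes. Returns the number of bytes read, or -1 on error. */
ssize_t read_as_string(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;                      /* no room even for the terminator */

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    ssize_t n = read(fd, buf, cap - 1); /* keep one byte free for '\0' */
    close(fd);
    if (n < 0)
        return -1;

    buf[n] = '\0';                      /* terminate right after the last byte read */
    return n;
}
```

Called with `char buf[3]` as `read_as_string("file", buf, sizeof buf)`, this should leave 'H', 'i' and '\0' in buf, so `printf("%s\n", buf)` has a well-defined place to stop.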

The way strings (char arrays, really) are handled in C is quite straightforward. Most, if not all, string functions assume that their string parameters are null-terminated (puts) or return null-terminated strings (strdup).

The point is that, by default, the computer can't tell where a string ends unless it is given the string's size every time it processes it. The simplest way around this was to append a 0 (the null byte) after each string. That way, the computer just needs to iterate over the string's characters and stop when it finds the terminating character (another name for the null byte).
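The iteration described above can be sketched as a hand-written length function. `my_strlen` is an illustrative name, not a library function; the standard strlen() behaves the same way but is usually implemented more cleverly.

```c
#include <stddef.h>

/* Illustrative scan: walk the characters until the null byte appears.
   If the array has no terminator, this loop runs past the end of the
   array, which is undefined behavior. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

The function has no way to know the size of the array it was given, which is why a missing terminator lets it (and printf with %s) wander into neighbouring memory.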

TDk
  • Let's assume I hard-code a char array char anotherBuf[2] and assign 'H' to [0] and 'i' to [1]. Then I can use printf("%s", anotherBuf) and it still prints the char array correctly, even without the terminating 0. It does not search for the "next" '\0' it can find. Why is it not the same with my char array? After all, it should contain only 'H' and 'i' after read(). – Tmirror Sep 20 '17 at 15:51
  • @Tmirror That test was not guaranteed to work either; it was just by chance that there happened to be a null byte soon after `anotherBuf` for `printf` to find and stop on. – Steve Summit Sep 20 '17 at 15:56
  • Printing a non-null-terminated string with `%s` in C is undefined behavior: `printf` keeps reading past the end of the array until it happens to find a null byte, so extra rubbish is the typical result. – TDk Sep 20 '17 at 15:57
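For completeness, a terminator is not the only option: printf accepts a precision on %s that caps how many bytes it reads, so the count returned by read() can be passed along instead. The sketch below assumes a hypothetical helper name, `print_counted`, and is an alternative to the answer's approach rather than part of it.

```c
#include <stdio.h>

/* Hypothetical helper: prints exactly the first n bytes of buf followed by
   a newline. With a precision ("%.*s"), printf reads at most n bytes, so
   buf does not need a terminating '\0' as long as it holds n bytes.
   Returns whatever printf returns (the number of characters written). */
int print_counted(const char *buf, size_t n)
{
    return printf("%.*s\n", (int)n, buf);
}
```

With the question's `char buf[2]`, this would be called with the value read() returned, after checking that the value is not negative.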