
I know it's very dumb, but I really don't get what the heck is happening here.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int getinput()
{
    char buf[10];
    int rv = read(0, buf, 1000);
    printf("\nNumber of bytes read are %d\n", rv);
    return 0;
}

int main()
{
    getinput();
    return 0;
}

I can't understand how this read() function is working.

read(0, buf, 1000)

Also, buf is only 10 bytes long, so why is it taking 23 bytes?

IrAM

3 Answers


Array-pointer equivalence

In C, an array like the variable buf in your example decays, in most expressions, into a pointer to the memory address of its first byte.

You can print the value of this pointer:

#include <stdio.h>

int main(void) {
    char buf[10];
    /* %p expects a void pointer, hence the cast */
    printf("Address of the first byte of buf: %p\n", (void *)buf);
    return 0;
}

Output:

Address of the first byte of buf: 0x7ffd3699bfb6

Pointer arithmetic

When you write something into this buffer with an instruction like

buf[3] = 'Z';

It is in fact translated to

*(buf+3) = 'Z';

It means "add 3 to the value of the pointer buf and store the character 'Z' at the resulting address".

Nothing is stopping you from storing the character 'Z' at any given address. You can store it before or after the address pointed to by buf without any restriction. If the address you choose happens to be the address of another variable, it will not produce a segmentation fault (the address is valid); it will just silently overwrite that variable.
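For example, here is a minimal sketch of that situation. The stack layout is not guaranteed, so which variable (if any) gets clobbered depends on the compiler; treat it as an illustration, not a reliable experiment:

#include <stdio.h>

int main(void) {
    char other = 'A';
    char buf[10];

    /* Write one byte past the end of buf. This is undefined behavior:
       depending on how the compiler laid out the stack, it may silently
       overwrite 'other' instead of causing a segmentation fault. */
    *(buf + 10) = 'Z';

    printf("other is now: %c\n", other);
    return 0;
}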

In C, you can even write the character 'Z' at the address 123456 if you like:

int main(void) {
    /* The language does not stop this, but address 123456 is almost
       certainly not mapped, so the program will most likely crash with
       a segmentation fault (undefined behavior). */
    char *address = (char *)123456;
    *address = 'Z';
    return 0;
}

The fact that your buffer is 10 bytes long does not change that. You cannot "fix" this because writing anything at any memory location is a fundamental feature of the C programming language. The only "fix" I can think of would be to use another language.

File descriptors opened at program startup

In your example, you pass the value 0 as the first argument of the function read(). This value is the file descriptor of the standard input, which is automatically opened at program startup (normally you get such a file descriptor as the result of a call to the function open()). So, if you get 23 read bytes, it means that you typed 23 characters on your keyboard during the program execution (for instance, 22 letters and 1 newline character).
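As a small sketch of the difference, a descriptor you open yourself usually gets the next free number after the three that are pre-opened (0 = standard input, 1 = standard output, 2 = standard error). The file path used here is just an assumption for illustration:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Descriptors 0, 1 and 2 are already open when main() starts,
       so a descriptor obtained with open() usually starts at 3. */
    int fd = open("/etc/hostname", O_RDONLY);  /* hypothetical path */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("Descriptor returned by open(): %d\n", fd);
    close(fd);
    return 0;
}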

It would be better to use the macro name of the standard input file descriptor:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int getinput()
{
    char buf[10];
    int rv = read(STDIN_FILENO, buf, 10);
    printf("\nNumber of bytes read are %d\n", rv);
    return 0;
}

int main()
{
    getinput();
    return 0;
}
RalphS
  • Just to make my concept clearer: it means that this behavior can change when we recompile it, and also that the fact it gives a segmentation fault has nothing to do with the computer, right? – Arslan Iftikhar Dec 04 '20 at 08:50
  • @ArslanIftikhar : Even without recompiling the program, the behavior can change between two executions of the same binary program. And no, the fact that it gives a segmentation fault does not mean that the computer has a problem, it just means that your program has a bug. – RalphS Dec 04 '20 at 09:16
  • OK, got it, and I know it does not mean the computer has an error; I thought it had something to do with a buffer register in the computer. Anyway, thanks, my friend – Arslan Iftikhar Dec 04 '20 at 09:20
  • Thanks for your help, my friend – Arslan Iftikhar Dec 04 '20 at 11:13

Your sample is a perfect example of a buffer overflow.

read(0, buf, 1000) will most probably corrupt memory (the stack, in your case).

read() takes the start address of your buf pointer and writes all of the bytes there, 23 in your case. Since buf only holds 10 of them, the remaining 13 bytes overwrite whatever other data lives in that memory, which can lead to very unwanted behavior (maybe even crashes of your application).
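A minimal sketch of what that corruption can look like (the stack layout is compiler-dependent, so the neighboring variable used here may or may not actually be hit; it is only an illustration):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    char after[8];
    char buf[10];

    memset(after, 0, sizeof after);

    /* Asking read() for up to 100 bytes into a 10-byte buffer is a
       buffer overflow: the extra bytes spill into neighboring memory. */
    int rv = read(0, buf, 100);
    printf("read %d bytes\n", rv);

    /* If the overflow reached 'after', part of the typed input shows up here. */
    printf("after now contains: \"%.7s\"\n", after);
    return 0;
}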

Bernd Farka

C gives the programmer the responsibility of handling memory correctly, so there is no bounds checking.

You call read() with 3 arguments:

  1. The file descriptor, in your case "0".
  2. The pointer to the array of bytes to fill with the bytes read from the file, in your case buf.
  3. The maximum number of bytes to read, which should be at most the size of this array; in your case "1000".

Apparently the input has only 23 bytes, which is less than or equal to 1000, so read() returns this value.

Note: But before that, it happily wrote all these 23 bytes into the array. Since your buffer has a capacity of just 10 bytes, the memory after it gets overwritten. This is called a "buffer overflow" and is a common error, abused for evil attacks, or possibly leading to crashes or malfunction (thanks, ikegami!).

To fix this error, I recommend changing the read call to:

    read(0, buf, sizeof buf);

This way you are always giving the right size to read(). (This works only if buf is declared as an array, of course; see the sketch below.)
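One caveat worth spelling out: sizeof buf only reports 10 while buf is visible as an array. If you pass buf to another function, it decays to a pointer there, and sizeof then yields the size of the pointer instead. A minimal sketch (fillbuf is a made-up name for illustration):

#include <stdio.h>

/* Inside this function, buf is just a pointer, so sizeof buf is the
   size of a pointer (typically 8 on a 64-bit system), not 10. */
static void fillbuf(char *buf) {
    printf("in fillbuf: sizeof buf = %zu\n", sizeof buf);
}

int main(void) {
    char buf[10];
    printf("in main:    sizeof buf = %zu\n", sizeof buf);  /* prints 10 */
    fillbuf(buf);
    return 0;
}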

the busybee
  • Re "abused for evil attacks": or more commonly, a reason programs crash and/or malfunction. – ikegami Dec 04 '20 at 07:44
  • I get that it will write 23 bytes, but it does not give a segmentation fault even after 12 or 13 bytes, only at 23, and on one machine it gives a segmentation fault at 21 bytes – Arslan Iftikhar Dec 04 '20 at 08:45
  • @ArslanIftikhar undefined behaviour is undefined. If you write outside of a buffer then anything can happen: for example nothing, a crash, program behaves strangely even later, some other unrelated variables change apparently for no reason etc. – Jabberwocky Dec 04 '20 at 08:49