68

I'm reading the source code of the Linux tool badblocks. It uses the read() function. How does that differ from the standard C fread() function? (I'm not counting the arguments as a difference.)

AIB
Georg Schölly

7 Answers

72

read() is a low-level read with no user-space buffering. On UNIX it maps almost directly onto a system call.

fread() is part of the C library, and provides buffered reads. It is usually implemented by calling read() in order to fill its buffer.
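To make the two layers concrete, here is a minimal sketch that reads the same file once through a file descriptor and once through a FILE stream. The helper names (`make_sample`, `via_read`, `via_fread`) are made up for illustration, and a POSIX system is assumed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a small file so there is something to read. */
int make_sample(const char *path, const char *text)
{
    FILE *fp = fopen(path, "wb");
    if (!fp) return -1;
    size_t len = strlen(text);
    size_t written = fwrite(text, 1, len, fp);
    int rc = fclose(fp);
    return (written == len && rc == 0) ? 0 : -1;
}

/* Low-level path: file descriptor + read(). Returns bytes read or -1. */
long via_read(const char *path, char *out, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    ssize_t n = read(fd, out, cap);      /* one call into the kernel */
    close(fd);
    return (long)n;
}

/* Library path: FILE stream + fread(). Returns bytes read or -1. */
long via_fread(const char *path, char *out, size_t cap)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;
    size_t n = fread(out, 1, cap, fp);   /* copies out of the stream's buffer */
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : (long)n;
}
```

For a small regular file both functions return the same bytes. The difference is in how many times the process enters the kernel when many small reads are issued.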

roschach
Darron
  • 4
    So there are 3 buffers? The harddrive has one, /dev/hda is buffered too and fread. Is this correct? – Georg Schölly Feb 24 '09 at 23:52
  • 8
    Yes. You can flush the third one using fflush() and the second one using fsync(). I don't know of a way to flush the hard drive's buffer. – Johannes Schaub - litb Feb 25 '09 at 00:16
  • 3
    fflush() only really applies to fwrite(), which has the same relation to write() that fread() has to read(). – Darron Feb 25 '09 at 11:50
  • @Darron I'm more of a Linux guy, and the question was about Linux. It can differ per OS, but in general one can assume that on Linux fread calls read. – Jānis Gruzis Jun 30 '16 at 11:38
  • @JohannesSchaub-litb Is my understanding correct: read is not actually a completely "unbuffered" read, in the sense that it still benefits from at least the hard drive's buffer and the OS-level buffer? The only buffer it misses is the library-level buffer that fread offers? – torez233 Oct 28 '22 at 07:08
49

Family read() -> open, close, read, write
Family fread() -> fopen, fclose, fread, fwrite

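Family read:

  • are system calls (reached through thin libc wrappers)
  • do no formatted I/O: they work on a raw, unformatted byte stream

Family fread:

  • are functions of the standard C library (libc)
  • use an internal user-space buffer
  • offer formatted I/O in some cases (the functions that take a "%.." format parameter, such as fprintf and fscanf)
  • go through the Linux buffer cache, as the read family also does by default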

More details here, although note that this post contains some incorrect information.
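The "formatted versus raw bytes" distinction can be sketched as follows. With the stdio family, fprintf() formats and buffers for you. With the descriptor family, you format into a buffer yourself and hand raw bytes to write(). The function names and the `value=%d` line layout are assumptions made up for this example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* stdio family: fprintf() does the formatting and the buffering. */
int write_formatted_stdio(const char *path, int value)
{
    FILE *fp = fopen(path, "w");
    if (!fp) return -1;
    int ok = fprintf(fp, "value=%d\n", value) > 0;
    int rc = fclose(fp);
    return (ok && rc == 0) ? 0 : -1;
}

/* Descriptor family: format by hand, then pass raw bytes to write(). */
int write_formatted_fd(const char *path, int value)
{
    char line[32];
    int len = snprintf(line, sizeof line, "value=%d\n", value);
    if (len < 0 || (size_t)len >= sizeof line) return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    ssize_t n = write(fd, line, (size_t)len);
    int rc = close(fd);
    return (n == len && rc == 0) ? 0 : -1;
}

/* Read the value back with fscanf(), the formatted-input counterpart. */
int read_value(const char *path, int *out)
{
    FILE *fp = fopen(path, "r");
    if (!fp) return -1;
    int matched = fscanf(fp, "value=%d", out);
    fclose(fp);
    return matched == 1 ? 0 : -1;
}
```

Both writers produce the same file contents, so `read_value()` can parse either one.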

roschach
AIB
  • 7
    The last two bullet points in both the read and fread lists are nonsense. Both families use the buffer cache by default, and which one to use has **nothing** to do with whether you are accessing a character device, a block device, or a regular file. – Marcus Dec 17 '15 at 15:47
  • 4
    AIB is confusing two layers of buffering -- the kernel buffering happens in both cases (what I would normally call the Linux buffer cache) but buffering that is done in userspace to reduce the total number of system calls I think only happens with fread. – Joseph Garvin Jan 15 '16 at 15:40
  • @Marcus I've removed most of the misconceptions from the answer. –  Mar 08 '17 at 18:37
  • Technically the read family are not syscalls themselves; they are part of the standard C library (libc) too. `read()` is just a "thinner" wrapper over the syscall than `fread()` – red0ct Oct 09 '22 at 16:47
10

As I remember it, the read()-level APIs do no buffering, so if you read() 1 byte at a time you pay a huge performance penalty compared to doing the same thing with fread(). fread() pulls in a block and doles it out as you ask for it, whereas read() drops into the kernel on every call.
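A rough sketch of the byte-at-a-time case described above. Both loops ask for one byte per call. With read() each iteration is a call into the kernel, while with fread() most iterations are served from the stream's buffer. The function names are hypothetical, and how large the speed difference is depends on the system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: create a file containing `size` filler bytes. */
int make_filler_file(const char *path, size_t size)
{
    FILE *fp = fopen(path, "wb");
    if (!fp) return -1;
    for (size_t i = 0; i < size; i++) {
        if (fputc('x', fp) == EOF) { fclose(fp); return -1; }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* One read() call per byte: every iteration enters the kernel. */
long count_bytes_read(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    long total = 0;
    char c;
    ssize_t n;
    while ((n = read(fd, &c, 1)) == 1)
        total++;
    close(fd);
    return n < 0 ? -1 : total;
}

/* One fread() call per byte: the library refills its buffer only occasionally. */
long count_bytes_fread(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;
    long total = 0;
    char c;
    while (fread(&c, 1, 1, fp) == 1)
        total++;
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : total;
}
```

Both functions return the same count. Running each under a system call tracer such as strace should show far more read calls for the first one.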

roschach
Joe
8

read is a syscall, whereas fread is a function in the C standard library.

phihag
  • 1
    @Jānis Gruzis, that depends on the implementation of `fread`. Mainly because there is no guarantee that the system call `read` is available. [Wikipedia `read` page](http://en.wikipedia.org/wiki/Read_\(system_call\)) – bzeaman Oct 31 '14 at 10:05
  • @JānisGruzis I just checked if it's the same on Windows and to my amazement I discovered that [**their `read` is deprecated**](https://msdn.microsoft.com/en-us/library/ms235412.aspx). Presumably their [`fread`](https://msdn.microsoft.com/en-us/library/kt0etdcs.aspx) calls `_read` and not `read`? In any case it seems that not every `fread` must call `read`. – rsp Jun 29 '16 at 23:15
5

One difference you should be aware of if you are converting code that uses one to using the other:

  • fread blocks until the number of bytes you asked for has been read, or the file ends, or an error occurs.
  • read also blocks, but if you ask for, say, 4 kB it may return after reading just 1 kB, even if the file has not ended.

This can cause subtle bugs, because whether a short read happens depends on where the file is stored, the state of caches, and so on.
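When converting fread() code to read(), a common way to deal with short reads is to loop until the requested amount has arrived. Below is a minimal sketch under that assumption. `read_full` and `demo_short_read` are hypothetical names, not standard functions, and the pipe is used only because it makes a short read easy to reproduce.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Keep calling read() until `count` bytes have arrived, end of file is
 * reached, or a real error occurs. Returns bytes read, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Shows a short read: only 5 bytes sit in the pipe, but 10 are requested.
 * read() returns what is available instead of waiting for the rest. */
ssize_t demo_short_read(void)
{
    int fds[2];
    char buf[10];
    if (pipe(fds) != 0) return -1;
    ssize_t n = -1;
    if (write(fds[1], "hello", 5) == 5)
        n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Code that relied on fread() returning the full amount can call `read_full()` in place of a bare read() to keep that behaviour.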

Tor Klingberg
1

read() --> A system call: your program calls into the kernel, which performs the I/O operation.

fread() --> A function provided by the standard library.

fread() is mainly used for binary file data, for example files in which struct data are stored. The main difference between the two is the number of system calls your application makes.

Standard I/O library functions such as fread() are optimized to keep the number of system calls low, instead of your application issuing every system call itself.
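As an illustration of the struct use case mentioned above, here is a small sketch that stores an array of structs with fwrite() and loads it back with fread(). The `struct record` type and both function names are invented for this example.

```c
#include <stdio.h>

/* Hypothetical record type, used only in this sketch. */
struct record {
    int id;
    double score;
};

/* Write `count` records as raw bytes. Returns 0 on success, -1 on failure. */
int save_records(const char *path, const struct record *recs, size_t count)
{
    FILE *fp = fopen(path, "wb");
    if (!fp) return -1;
    size_t written = fwrite(recs, sizeof recs[0], count, fp);
    int rc = fclose(fp);
    return (written == count && rc == 0) ? 0 : -1;
}

/* Read up to `max` records. Returns the number of complete records, or -1. */
long load_records(const char *path, struct record *recs, size_t max)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;
    /* fread() counts items of the given size, not bytes. */
    size_t got = fread(recs, sizeof recs[0], max, fp);
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : (long)got;
}
```

Note that a file written this way depends on the struct's in-memory layout, so it is only safe to read back with a program built for the same platform and compiler settings.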

Adrian Mole
chakra t
0

For beginners like me in the C/systems programming domain: I am roughly quoting the answer from the lecture around this timestamp by Professor John Kubiatowicz.

fread is a high-level C API that internally uses the low-level read system call in an optimized way.

Imagine your system is optimized to read 4k bytes at a time. When you use fread to read from a file in a while loop, the library issues the read system call once to get a chunk of 4k bytes from the kernel and saves it in a user-space buffer. All subsequent reads, up to 4k bytes in total, are then served from that user buffer. This is good because system calls are expensive.
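The effect can be observed indirectly. The sketch below gives the stream a 4096-byte buffer with setvbuf(), asks fread() for a single byte, and then checks how far the kernel's file offset has advanced. The function names are hypothetical, 4096 simply mirrors the 4k figure above, and the exact offset reported depends on the C library, so treat it as an illustration and not a guarantee.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict compiler modes */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a file of `size` bytes. */
int make_file(const char *path, size_t size)
{
    FILE *fp = fopen(path, "wb");
    if (!fp) return -1;
    for (size_t i = 0; i < size; i++) {
        if (fputc('x', fp) == EOF) { fclose(fp); return -1; }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Request ONE byte through fread(), then report where the kernel's file
 * offset ended up. Returns that offset, or -1 on failure. */
long offset_after_one_byte(const char *path)
{
    static char buffer[4096];         /* user-space stream buffer */
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;
    if (setvbuf(fp, buffer, _IOFBF, sizeof buffer) != 0) {
        fclose(fp);
        return -1;
    }

    char c;
    if (fread(&c, 1, 1, fp) != 1) {
        fclose(fp);
        return -1;
    }

    /* The underlying descriptor has typically moved by about a whole
     * buffer, because the library filled its buffer with one read(). */
    off_t kernel_pos = lseek(fileno(fp), 0, SEEK_CUR);
    fclose(fp);
    return (long)kernel_pos;
}
```

If the returned offset is larger than 1, the library read ahead into its buffer, so the next several thousand one-byte fread() calls need no further system call.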

This is also highlighted by @Joseph Garvin in his comment above.

bad programmer