To extend the title: I am wondering how the OS handles functions like fwrite, fread, fopen and fclose.

What is actually a stream?

Sorry if I was not clear enough.

BTW I am using GNU/Linux Ubuntu 11.04.

A bit better explanation of what I am trying to ask.

I want to know how files are written to the HDD, how they are read into memory, and how a handle to them is later created. Is the BIOS doing that through drivers?

  • The OS will pass this call to an implementing file system driver, which in turn passes it to the underlying storage's device driver. – Santa Aug 23 '11 at 18:49
  • Can you provide some more detail on what you're looking for? What are you trying to solve? Do you want to know where you can find the Linux source code for the implementations of fwrite and friends? Do you have a specific question about the implementation of fwrite? Are you on an embedded system? – Rian Sanderson Aug 23 '11 at 18:54
  • @Santa. What is the underlying storage's device driver? –  Aug 23 '11 at 19:03

2 Answers

The C library takes a function like fopen and converts that to the proper OS system call. On Linux that is the POSIX open function. You can see the definition for this in a Linux terminal with man 2 open. On Windows the call would be CreateFile which you can see in the MSDN documentation. On Windows NT, that function is in turn another translation of the actual NT kernel function NtCreateFile.
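As a rough illustration of that system-call layer on Linux, the sketch below writes and then reads a file using open, write, read and close directly, which is approximately what fopen, fwrite, fread and fclose end up calling underneath. The helper names `write_with_syscalls` and `read_with_syscalls` are made up for this example and are not part of any library.

```c
#include <fcntl.h>      /* open, O_* flags */
#include <string.h>     /* strlen */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* read, write, close */

/* Hypothetical helper: create or truncate `path` and write `text` to it.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_with_syscalls(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    size_t done = 0;
    while (done < len) {
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t n = write(fd, text + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    if (close(fd) < 0)
        return -1;
    return (ssize_t)done;
}

/* Hypothetical helper: read up to cap-1 bytes of `path` into `buf`
 * and NUL-terminate it. Returns the byte count, or -1 on error. */
ssize_t read_with_syscalls(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < cap - 1) {
        ssize_t n = read(fd, buf + done, cap - 1 - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)     /* end of file */
            break;
        done += (size_t)n;
    }
    buf[done] = '\0';
    close(fd);
    return (ssize_t)done;
}
```

The integer returned by open is the file descriptor, which is the kernel's handle for the open file; the C library stores it inside the stream it hands back from fopen.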

A stream in the C library is a collection of information stored in a FILE struct. This is usually a 'handle' to the operating system's idea of the file, an area of memory allocated as a 'buffer', and the current read and write positions.
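To make that description more concrete, here is a toy write-only stream. It is not glibc's real FILE layout; the struct `toy_stream` and the functions `toy_open`, `toy_putc`, `toy_flush` and `toy_close` are invented for this sketch, and the buffer is kept tiny on purpose.

```c
#include <fcntl.h>   /* open, O_* flags */
#include <stdlib.h>  /* malloc, free, size_t */
#include <unistd.h>  /* write, close, ssize_t */

#define TOY_BUFSIZE 16  /* deliberately small so flushes are easy to see */

typedef struct {
    int    fd;                /* the OS handle (file descriptor) */
    char   buf[TOY_BUFSIZE];  /* user-space buffer */
    size_t used;              /* bytes currently waiting in buf */
    long   pos;               /* logical write position in the file */
    int    flushes;           /* times the buffer was handed to the OS */
} toy_stream;

toy_stream *toy_open(const char *path)
{
    toy_stream *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (s->fd < 0) {
        free(s);
        return NULL;
    }
    s->used = 0;
    s->pos = 0;
    s->flushes = 0;
    return s;
}

/* Push any buffered bytes to the kernel with write(). */
int toy_flush(toy_stream *s)
{
    size_t done = 0;
    while (done < s->used) {
        ssize_t n = write(s->fd, s->buf + done, s->used - done);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }
    if (s->used > 0)
        s->flushes++;
    s->used = 0;
    return 0;
}

/* Store one byte in the buffer; only a full buffer triggers a flush. */
int toy_putc(toy_stream *s, char c)
{
    if (s->used == TOY_BUFSIZE && toy_flush(s) < 0)
        return -1;
    s->buf[s->used++] = c;
    s->pos++;
    return 0;
}

int toy_close(toy_stream *s)
{
    int rc = toy_flush(s);
    if (close(s->fd) < 0)
        rc = -1;
    free(s);
    return rc;
}
```

Writing 40 bytes through `toy_putc` reaches the kernel only when the 16-byte buffer fills up or the stream is closed, which is the main reason the C library adds a stream layer on top of the raw file descriptor.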

I just noticed you tagged this with 'assembly'. You might then want to know about the really low level details. This seems like a good article.

Now you've changed the question to ask about even lower levels. Well, once the operating system gets a command to open a file, it passes that command to the VFS (Virtual File System). That piece of the operating system looks up the file name, including any directories needed, and does the necessary access checks. If the directory information is already cached in RAM then no disk access is needed. If not, the VFS sends a read request to the specific file system, which is probably EXT4. The EXT4 file system driver will then determine which disk block that directory is located in, and send a read command to the disk device driver.

Assuming that the disk driver is AHCI, it will convert a request to read a block into a series of register writes that will set up a DMA (Direct Memory Access) request. This looks like a good source for some details.

At that point the AHCI controller on the motherboard takes over. It will communicate with the hard disk controller to cooperate in reading the data and writing into the DMA memory location.

While this is going on the operating system puts the process on hold so it can continue with other work. The hardware is taking care of things and the CPU isn't required to pay attention. The disk request will take many milliseconds during which the CPU can run millions of instructions.

When the request is complete the AHCI controller will send an interrupt. One of the system CPUs will receive the interrupt, look in its IDT (Interrupt Descriptor Table) and jump to the machine code at that location: the interrupt handler.

The operating system interrupt handler will read some data, find out that it has been interrupted by the AHCI controller, then it will jump into the AHCI driver code. The AHCI driver will read the registers on the controller, determine that the read is complete, put a marker into its operations queue, tell the OS scheduler that it needs to run, then return. Nothing else happens at this point.

The operating system will note that it needs to run the AHCI driver's queue. When it decides to do that (it might have a real-time task running or it might be reading networking packets at the moment) it will then go read the data from the memory block marked for DMA and copy that data to the EXT4 file system driver. That EXT4 driver will then return the data to the VFS which will put it into cache. The VFS will return an operating system file handle to the open system call, which will return that to the fopen library call, which will put that into the FILE struct and return a pointer to that to the program.

Zan Lynx
  • What does "the operating system's idea of the file" mean? How do OSes treat a file? Ordinary users see it simply as a file. I see it as a string. –  Aug 23 '11 at 18:59
  • Thank you. Just added this to fit the 15 chars limit. –  Aug 23 '11 at 19:50
  • Great answer. I was just wondering: when the CPU receives an interrupt, does the IDT (Interrupt Descriptor Table) entry that the OS runs depend on the device? Or will it only unblock the driver so it can do its job? – fredcrs Apr 22 '13 at 22:24
  • @fredcrs: Devices can share interrupts which makes the system have to check every device on that interrupt. More modern systems get away from that which saves time. What code actually runs depends on the driver. – Zan Lynx Apr 23 '13 at 02:45

fopen et al. are usually implemented on top of OS-specific system calls. On Unix, this means the APIs for working with file descriptors: open, read, write, close, and a few others. On Windows, it's CreateFile, ReadFile, etc.
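On a POSIX system you can see both directions of that relationship with fileno and fdopen. The sketch below is only an illustration; the helper names `descriptor_behind_fopen` and `stream_over_descriptor` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fileno and fdopen in <stdio.h> */
#include <fcntl.h>   /* open, O_* flags */
#include <stdio.h>   /* fopen, fclose, fileno, fdopen, fprintf */
#include <unistd.h>  /* close */

/* Hypothetical helper: open a stream with fopen and report the file
 * descriptor the C library obtained for it. Returns -1 on failure. */
int descriptor_behind_fopen(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (!fp)
        return -1;
    int fd = fileno(fp);  /* the descriptor stored inside the stream */
    fclose(fp);           /* also closes the descriptor */
    return fd;
}

/* Hypothetical helper: go the other way, wrapping a descriptor from
 * open() in a buffered stream. Returns characters written, or -1. */
int stream_over_descriptor(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    FILE *fp = fdopen(fd, "w");
    if (!fp) {
        close(fd);
        return -1;
    }
    int n = fprintf(fp, "%s", text);  /* buffered in user space */
    if (fclose(fp) != 0)              /* flushes via write(), then closes fd */
        return -1;
    return n;
}
```

On Windows the equivalent handle would come from CreateFile rather than open; this sketch covers only the Unix side.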

Marcelo Cantos