I've been learning assembly and working with file I/O. As I understand it, the process goes as follows. A program makes a system call asking the kernel to open a file, e.g. "hello.txt". The kernel locates the file in the filesystem (persistent storage), makes it accessible for reading and/or writing, and returns a file descriptor that uniquely identifies the open file. From my understanding, the file descriptor is an index into a table that stores file data. My question is: what data is stored? Presumably storing the entire file contents would get grossly memory-expensive for large files. Does it store file metadata like MIME type, encoding, etc.? Or does it actually store the whole contents?
-
That depends on the operating system in question, but generally the file data itself is not stored (files can exceed the available memory). Would you be happy with an explanation taking one operating system as an example? – fuz Mar 17 '22 at 15:00
-
You can look at how popular operating systems implement it: here are the [high-level descriptions of the contents in Windows](https://docs.microsoft.com/en-us/windows/win32/fileio/file-objects), here is the [actual C `struct FILE_OBJECT` with all its fields in Windows](https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_file_object); here is the [C `struct file` with all its fields in Linux](https://elixir.bootlin.com/linux/latest/source/include/linux/fs.h#L962) and here is [a YouTube video that goes behind the scenes in Linux](https://www.youtube.com/watch?v=rW_NV6rf0rM) – CherryDT Mar 17 '22 at 15:01
-
See also this question: [What are file descriptors, explained in simple terms?](https://stackoverflow.com/questions/5256599/what-are-file-descriptors-explained-in-simple-terms) - And one more link: [Annotated `filedesc` and `file` structs in FreeBSD](https://chenshuo.com/notes/kernel/file-descriptor-table/#freebsd-up-to-93). - All assuming you want to "get a feeling" for what kind of data there is in different systems, not 100% understand each field. – CherryDT Mar 17 '22 at 15:02
-
The information stored by the operating system will describe, among other things: the location of the file on disk; the mode and sharing/locking properties of the open file, e.g. with respect to other processes; and the seek position within the file, i.e. which bytes will be fetched on the next read. – Erik Eidt Mar 17 '22 at 15:30
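As a rough illustration of the seek position and open mode mentioned above, here is a minimal C sketch, assuming a POSIX system such as Linux. The helper names `current_offset` and `access_mode` are made up for this example; `lseek` and `fcntl` are the real POSIX calls that ask the kernel what it has recorded for a descriptor.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the offset the kernel currently tracks for fd,
   or -1 on error (for example, if fd refers to a pipe). */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);  /* move by 0 bytes: just report the position */
}

/* Hypothetical helper: the access mode recorded at open time
   (O_RDONLY, O_WRONLY, or O_RDWR), or -1 on error. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);  /* file status flags kept by the kernel */
    return flags == -1 ? -1 : (flags & O_ACCMODE);
}
```

Calling `current_offset(fd)` before and after a `write` of five bytes should show the offset advancing by five, even though the program itself never stored a position anywhere.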
-
A *descriptor* doesn't include the file data, just a *reference* to the file (and a current position, open mode, append flag, and possibly other stuff). User-space buffered I/O like C `stdio.h` has read/write buffers inside your process, as part of a `FILE` struct. But that's a C `FILE` object, not an OS-level file descriptor. – Peter Cordes Mar 17 '22 at 18:17
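To make the `FILE` versus descriptor distinction concrete, here is a small C sketch, assuming a POSIX system. The helper `compare_positions` is a made-up name for this example. It reads one character through `stdio` and then reports both the position `stdio` believes it is at and the offset the kernel holds for the underlying descriptor.

```c
#define _POSIX_C_SOURCE 200809L  /* needed for fileno() under strict -std= modes */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read one character from `path` via stdio, then store
   stdio's logical position in *stdio_pos and the kernel's offset for the
   underlying descriptor in *kernel_pos. Returns 0 on success, -1 on failure. */
int compare_positions(const char *path, long *stdio_pos, off_t *kernel_pos)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    if (fgetc(fp) == EOF) {          /* consumes one byte from the stdio buffer */
        fclose(fp);
        return -1;
    }

    *stdio_pos = ftell(fp);                        /* position as stdio sees it */
    *kernel_pos = lseek(fileno(fp), 0, SEEK_CUR);  /* position as the kernel sees it */

    fclose(fp);
    return 0;
}
```

On typical C libraries the kernel offset will be ahead of the `stdio` position, because the library reads a whole buffer's worth of bytes into process memory at once. The exact gap depends on the library's buffer size, so treat any particular numbers as implementation details.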
-
It stores whatever you can imagine the kernel needs in order to do `read` and `write`. For example, where the file is on the disk. – user253751 Mar 21 '22 at 18:25
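One way to see that a descriptor leads the kernel to the file's identity and metadata, without holding its contents, is `fstat`. The following is a minimal C sketch, assuming a POSIX system; `size_via_fd` and `same_file` are hypothetical helper names chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std= modes */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: file size in bytes looked up through the descriptor,
   or -1 on error. No file contents are read to answer this. */
off_t size_via_fd(int fd)
{
    struct stat st;
    if (fstat(fd, &st) == -1)
        return -1;
    return st.st_size;
}

/* Hypothetical helper: 1 if both descriptors refer to the same file
   (same device and inode number), 0 if not, -1 on error. */
int same_file(int fd_a, int fd_b)
{
    struct stat a, b;
    if (fstat(fd_a, &a) == -1 || fstat(fd_b, &b) == -1)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Note that `struct stat` has fields for size, permissions, timestamps, and the inode number, but nothing like a MIME type or text encoding; on POSIX systems those are left to user-space programs to work out.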