3

While trying to learn socket programming, I saw the following code:

int sock;
sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);

I browsed through the man page and found that socket returns a file descriptor. I have tried searching the internet and other similar questions here, but I couldn't understand what a file descriptor really is. It would be great if someone could explain file descriptors in simple language.

S.B
Razin
  • I recommend reading a structured introduction to complex subjects like network programming. If you can afford a book or have access to a library, read Stevens' *Advanced Programming in the UNIX Environment.* If you do a lot of network programming, check out his network programming books. Man pages are great as a reference but poor for getting the overall idea. – Peter - Reinstate Monica Nov 29 '16 at 11:37

5 Answers

16

There are two related objects: the file descriptor and the file description. People often confuse the two and think they are the same.

A file descriptor is an integer in your application that refers to a file description in the kernel.

A file description is the structure in the kernel that maintains the state of an open file (its current position, blocking/non-blocking mode, etc.). In Linux, the file description is struct file.

POSIX open():

The open() function shall establish the connection between a file and a file descriptor. It shall create an open file description that refers to a file and a file descriptor that refers to that open file description. The file descriptor is used by other I/O functions to refer to that file. The path argument points to a pathname naming the file.

The open() function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process. The open file description is new, and therefore the file descriptor shall not share it with any other process in the system.

Joachim Sauer
Maxim Egorushkin
  • "File descriptor is an integer in your application that refers to the file description in the kernel." There are multiple file descriptors; do they all refer to a single file description? – ki9 Mar 03 '20 at 18:37
  • 2
    @Keith `dup` creates a new descriptor for the same description, unlike `open`. – Maxim Egorushkin Mar 03 '20 at 18:39
4

In Unix/Linux operating systems, a file descriptor is an abstract handle used to access a file or other I/O (input/output) resource, such as a pipe or network socket. A file descriptor is normally an index into a per-process file descriptor table maintained by the kernel. Entries in that table in turn refer to a system-wide table of files opened by all processes, called the file table. The file table records the "mode" in which the file or other resource was opened:

  • reading
  • writing
  • appending

and possibly other modes. Each file table entry also refers to an entry in a third table, called the inode table, which describes the actual underlying files.

4

File descriptors are nothing but handles that map to open files. You can think of them as references to the files that the process is using.
FDs are just integer values which act as handles to process resources; they are not pointers in the C sense.

On Linux, every running process has a /proc/<pid> directory, which exposes the data related to that process. A process also normally starts with 3 file descriptors already open (inherited from its parent) for the 3 data streams referred to as stdin, stdout and stderr.
The kernel always gives a new FD the lowest unused integer value, and by convention these data streams are mapped to the numbers 0, 1 and 2.

Let's say your code opens a file to read from or to write to. This means the process needs access to a resource, and a handle has to be created for this new resource.
To do this, the kernel creates a FD as part of the open call and returns it to your code.

If you run ls -l /proc/<pid>/fd/ you will see an additional FD listed there, typically with number 3 (it can be a higher number if the program already has other resources open).

swayamraina
3

I think of file descriptors as (indirect, higher-level) pointers to opaque file objects maintained by the kernel.

Normally, when you deal with objects maintained by a library, you pass to the library pointers to objects that you're not supposed to dereference and manipulate yourself.

For kernel objects, it's not just that you're not supposed to manipulate them yourself; you literally can't, because they live in a different address space that's not at all accessible to you. And because they live in a different address space, pointers wouldn't be a meaningful way of referring to them.

You need a token or handle which the kernel would internally resolve to a pointer that's meaningful in the kernel address space. File descriptors are such tokens in integer form.

For the kernel:

your_process_id + your_file_descriptor => kernels_file_object_pointer

(or an EBADF error if the given file descriptor cannot be resolved to a file object pointer for the given process)

Petr Skocik
0
  • A file descriptor is a number associated with an open file.

  • It acts as an index (entry number) into the file descriptor table.

  • File descriptor tables are per-process, but can be shared between
    processes too.

  • Each entry in the file descriptor table holds a pointer to an entry in the file description table, which is system-wide.

  • Each entry in the file description table contains the actual data about the open file: its access mode, offset and status flags, plus a pointer to an entry in the inode table, which holds information about the actual location of the file on the storage media (block numbers).


Dražen G.