
I'm studying system calls in Linux and have been reading the implementation of the read() system call:

SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed);
    if (file) {
        loff_t pos = file_pos_read(file);
        ret = vfs_read(file, buf, count, &pos);
        file_pos_write(file, pos);
        fput_light(file, fput_needed);
    }

    return ret;
}

This is the definition of fget_light():

struct file *fget_light(unsigned int fd, int *fput_needed)
{
    struct file *file;
    struct files_struct *files = current->files;

    *fput_needed = 0;
    if (likely((atomic_read(&files->count) == 1))) {
        file = fcheck_files(files, fd);
    } else {
        rcu_read_lock();
        file = fcheck_files(files, fd);
        if (file) {
            if (atomic_long_inc_not_zero(&file->f_count))
                *fput_needed = 1;
            else
                /* Didn't get the reference, someone's freed */
                file = NULL;
        }
        rcu_read_unlock();
    }

    return file;
}

Can you explain what fget_light() does?

Kahn Cse

2 Answers


Each task has a file descriptor table. This file descriptor table is indexed by file descriptor number, and contains information (file descriptions) about each open file.
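As a rough illustration of "indexed by file descriptor number", here is a minimal user-space sketch. The names toy_file, toy_fd_table, toy_lookup and TOY_MAX_FDS are hypothetical stand-ins invented for this example; the real kernel structures (struct file, struct files_struct) are considerably more involved and grow dynamically.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a file description; not the kernel's struct file. */
struct toy_file {
    const char *name;   /* label for the open file */
};

#define TOY_MAX_FDS 8   /* assumed fixed size, purely for illustration */

/* Hypothetical stand-in for a per-task file descriptor table. */
struct toy_fd_table {
    struct toy_file *slots[TOY_MAX_FDS];   /* NULL means "fd not open" */
};

/* Translate a descriptor number into a file description pointer.
 * Returns NULL for out-of-range or unused descriptors. */
struct toy_file *toy_lookup(struct toy_fd_table *t, unsigned int fd)
{
    assert(t != NULL);
    if (fd >= TOY_MAX_FDS)
        return NULL;
    return t->slots[fd];
}
```

A lookup on an unused or out-of-range descriptor yields NULL, which is the case where read() above ends up returning -EBADF.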

Like many other objects in the kernel, file descriptions are reference-counted. This means that when some part of the kernel wants to access a file description, it has to take a reference, do whatever it needs to do, and then release the reference. When the reference count drops to zero, the object can be freed. For file descriptions, open() increments the reference count and close() decrements it, so a file description cannot be released while it is open or while the kernel is still using it. For example, imagine one thread in your process close()ing a file while another thread is still read()ing it: the file description will not actually be released until the read() drops its reference with fput().
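The take/release pattern can be sketched in user space with C11 atomics. This is only a model under stated assumptions: toy_obj, toy_get and toy_put are hypothetical names, the "released" flag stands in for actually freeing the object, and toy_get mimics the "increment unless zero" behaviour that atomic_long_inc_not_zero() provides in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; not a kernel type. */
struct toy_obj {
    atomic_long refcount;   /* 0 means the object is gone or being freed */
    bool released;          /* set when the last reference is dropped */
};

/* Take a reference unless the count has already reached zero. */
bool toy_get(struct toy_obj *o)
{
    long old = atomic_load(&o->refcount);
    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;   /* too late: the last reference was already dropped */
}

/* Drop a reference; the caller that drops the last one releases the object. */
void toy_put(struct toy_obj *o)
{
    long prev = atomic_fetch_sub(&o->refcount, 1);
    assert(prev > 0);
    if (prev == 1)
        o->released = true;   /* real code would free the object here */
}
```

In the close()-during-read() scenario, the count goes 1 (open), 2 (read takes a reference), 1 (close), 0 (read finishes), and only the final toy_put marks the object as released.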

To get a reference to a file description from a file descriptor, the kernel uses fget(); fput() releases that reference. Since several threads may be accessing the same file description at the same time on different CPUs, fget() and fput() must use appropriate locking. Nowadays they use RCU, so mere readers of the file descriptor table incur little or no cost.

But RCU alone is not enough of an optimization. Processes that are not multi-threaded are very common, and in that case you don't have to worry about other threads of the same process accessing the same file description: the only task with access to our file descriptor table is us. So, as an optimization, fget_light()/fput_light() don't touch the reference count when the current file descriptor table is used by only a single task.

struct file *fget_light(unsigned int fd, int *fput_needed)
{
     struct file *file;
     /* The file descriptor table for our _current_ task */
     struct files_struct *files = current->files;

     /* Assume we won't need to touch the reference count:
      *  the count can't reach zero while we work (we are not close(),
      *  and if nobody shares our fd table, no close() can run concurrently),
      *  so fput_light() won't actually need to fput().
      */
     *fput_needed = 0;

     /* Check whether we are actually the only task with access to the fd table */
     if (likely((atomic_read(&files->count) == 1))) {
             /* Yep, get the reference to the file description */
             file = fcheck_files(files, fd);
     } else {
             /* Nope, we'll need some locking */
             rcu_read_lock();
             /* Get the reference to the file description */
             file = fcheck_files(files, fd);
             if (file) {
                     /* Increment the reference count */
                     if (atomic_long_inc_not_zero(&file->f_count))
                             /* fput_light() will actually need to fput() */
                             *fput_needed = 1;
                     else
                             /* Didn't get the reference, someone's freed */
                             /* Happens if the file was close()d and all the
                              *  other accessors finished their work and called fput().
                              */
                             file = NULL;
             }
             rcu_read_unlock();
     }

     return file;
}
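
To make the fast path and slow path easier to see in isolation, here is a simplified user-space model. Everything in it is hypothetical (toy_file, toy_table, toy_get_light, toy_put_light, and the fixed table size of 4), and it is deliberately single-threaded: a plain increment replaces the atomic increment-unless-zero under RCU that the kernel needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct file / files_struct. */
struct toy_file {
    long refcount;
};

struct toy_table {
    int users;                   /* number of tasks sharing this table */
    struct toy_file *slots[4];   /* NULL means "fd not open" */
};

/* Look up fd; only take a reference if the table is shared. */
struct toy_file *toy_get_light(struct toy_table *t, unsigned int fd,
                               int *put_needed)
{
    struct toy_file *f;

    assert(t != NULL && put_needed != NULL);
    *put_needed = 0;
    if (fd >= 4)
        return NULL;
    f = t->slots[fd];
    if (f != NULL && t->users > 1) {
        /* Slow path. Assumption: single-threaded model, so a plain
         * increment stands in for an atomic inc-not-zero under RCU. */
        f->refcount++;
        *put_needed = 1;
    }
    return f;
}

/* Drop the reference only if toy_get_light() actually took one. */
void toy_put_light(struct toy_file *f, int put_needed)
{
    if (put_needed)
        f->refcount--;
}
```

With users == 1 the reference count is never touched and put_needed stays 0; with users > 1 the count is bumped and the matching toy_put_light() call undoes it.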
ninjalj

Basically, the function translates the fd that the user passed to the syscall into the kernel-internal struct file pointer. It does so by calling fcheck_files(), which looks the fd up in the process's file table (its files parameter). For more information, read this.

jpalecek