14

In LDD3, I saw this code:

static unsigned int scull_p_poll(struct file *filp, poll_table *wait)
{
    struct scull_pipe *dev = filp->private_data;
    unsigned int mask = 0;

    /*
     * The buffer is circular; it is considered full
     * if "wp" is right behind "rp" and empty if the
     * two are equal.
     */
    down(&dev->sem);
    poll_wait(filp, &dev->inq,  wait);
    poll_wait(filp, &dev->outq, wait);
    if (dev->rp != dev->wp)
        mask |= POLLIN | POLLRDNORM;    /* readable */
    if (spacefree(dev))
        mask |= POLLOUT | POLLWRNORM;   /* writable */
    up(&dev->sem);
    return mask;
}

But the book says poll_wait doesn't wait and returns immediately. Then why do we need to call it? Why can't we just return the mask?

demonguy

3 Answers

19

poll_wait adds your device (represented by the "struct file") to the list of those that can wake the process up.

The idea is that the process can use poll (or select, epoll, etc.) to add a bunch of file descriptors to the list on which it wishes to wait. The poll method of each driver gets called. Each one adds itself (via poll_wait) to the waiter list.

Then the core kernel blocks the process in one place. That way, any one of the devices can wake up the process. If you return non-zero mask bits, that means those "ready" attributes (readable/writable/etc) apply now.

So, in pseudo-code, it's roughly like this:

foreach fd:
    find device corresponding to fd
    call device poll function to set up wait queues (with poll_wait) and to collect its "ready-now" mask

while time remaining in timeout and no devices are ready:
    sleep

return from system call (either due to timeout or to ready devices)
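
Seen from user space, the two ways out of that loop (timeout, or a ready device) can be observed with a plain pipe. This is only a minimal sketch: `wait_readable` and `pipe_poll_demo` are hypothetical helper names, the 50 ms timeout is arbitrary, and a pipe stands in for the scull device.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Block in poll(2) until fd is readable or timeout_ms elapses.
 * Returns the revents mask, 0 on timeout, or -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;           /* nothing became ready before the timeout */
    return pfd.revents;
}

/* Poll an empty pipe, then poll it again after writing one byte.
 * Stores both results; returns 0 on success, -1 if setup failed. */
int pipe_poll_demo(int *before, int *after)
{
    int fds[2];

    assert(before != NULL && after != NULL);
    if (pipe(fds) != 0)
        return -1;

    *before = wait_readable(fds[0], 50);    /* empty pipe: sleeps, then times out */

    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    *after = wait_readable(fds[0], 50);     /* data present: returns right away */

    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

Calling `pipe_poll_demo(&before, &after)` should leave `before` at 0 (the timeout case) and `after` with `POLLIN` set (the ready case).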
Gil Hamilton
  • Then when does the process sleep? – demonguy May 15 '15 at 01:22
  • You mean, the poll call from user space will block the process, right? – demonguy May 15 '15 at 15:00
  • Yes. When you call poll(2) in user space, that goes to a function called "sys_poll" inside the kernel (see fs/select.c in kernel source). Likewise, select(2) => sys_select, etc. All those functions follow more or less the pseudo-code I gave above. – Gil Hamilton May 15 '15 at 15:12
  • I have a question: what does wait_queue_head_t do? void poll_wait (struct file *, wait_queue_head_t *, poll_table *); – Kevin Ding Apr 15 '21 at 02:41
  • It's a data structure that anchors the head of the queue of "waiting processes" (within this device). So that if an interrupt comes in that delivers data (for Read) or frees up space (for Write), the device can notify the core kernel that any waiting process on the queue can be awakened (which would result in each process being unblocked [scheduled to run] and hence cause a return to user space from the select/poll syscall that the process in the queue is blocked in). – Gil Hamilton Apr 15 '21 at 15:34
3

The poll system call sleeps if your poll file_operation returns 0

This is what was confusing me.

When you return non-zero, it means that some event is ready, and the system call returns to user space instead of sleeping.

Once you see this, it is clear that something must be tying the process to the wait queue, and that thing is poll_wait.
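
To make that tie concrete, here is a toy user-space model in C. It is not kernel code: every `toy_` name is made up for illustration, a `woken` flag stands in for actually sleeping and being rescheduled, and the fixed-size array replaces the kernel's linked wait queue.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POLLIN      0x1
#define TOY_MAX_WAITERS 4

/* Stands in for the process blocked in poll/select. */
struct toy_waiter {
    int woken;
};

/* Stands in for wait_queue_head_t: the list of waiters to wake. */
struct toy_wait_queue {
    struct toy_waiter *waiters[TOY_MAX_WAITERS];
    size_t count;
};

struct toy_device {
    struct toy_wait_queue inq;
    int data_ready;
};

/* Like poll_wait: it only registers the waiter and returns at once.
 * In this toy, a NULL waiter means "do not register". */
void toy_poll_wait(struct toy_wait_queue *q, struct toy_waiter *w)
{
    if (w == NULL)
        return;
    assert(q->count < TOY_MAX_WAITERS);
    q->waiters[q->count++] = w;
}

/* The driver's poll method: register first, then report what is ready now. */
unsigned int toy_dev_poll(struct toy_device *dev, struct toy_waiter *w)
{
    toy_poll_wait(&dev->inq, w);
    return dev->data_ready ? TOY_POLLIN : 0;
}

/* What the driver does when data arrives: wake everyone on the queue. */
void toy_dev_data_arrived(struct toy_device *dev)
{
    dev->data_ready = 1;
    for (size_t i = 0; i < dev->inq.count; i++)
        dev->inq.waiters[i]->woken = 1;
}
```

If `toy_dev_poll` is called with a waiter and returns 0, a later `toy_dev_data_arrived` sets that waiter's `woken` flag. If the waiter was never registered, the device has nobody to wake, which is the situation you would be in if the driver returned only the mask.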

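Also remember that struct file represents "a connection between a process and an open file", not just a filesystem file. The waiting process itself is recorded through the poll_table that the kernel passes to your poll method (it is set up for the current task), which is why poll_wait takes both the file and the poll_table.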

Playing with a minimal runnable example might also help clear things up: https://stackoverflow.com/a/44645336/895245

Ciro Santilli OurBigBook.com
-2

poll_wait triggers when an expected event has occurred on any of the fds it is waiting on, or when it hits the timeout.

Check the mask to know which event triggered poll_wait. If you don't want poll_wait to trigger on such an event, you can configure that when registering the file descriptor with the poll fd.

Srikanth
  • This is completely wrong. `poll_wait` doesn't 'trigger' at all. It simply adds a wait queue to the `poll_table`. – EML Mar 25 '16 at 17:13