9

After looking at the kernel docs here: https://www.kernel.org/doc/Documentation/PCI/pci.txt I am lost as to the ordering of function calls to set up and tear down a PCI driver.

I have two questions:

  1. For setup, does pci_enable_device() always come before pci_request_regions()? The documentation seems to point to this fact, but does state:

    OS BUG: we don't check resource allocations before enabling those resources. The sequence would make more sense if we called pci_request_resources() before calling pci_enable_device(). Currently, the device drivers can't detect the bug when when two devices have been allocated the same range. This is not a common problem and unlikely to get fixed soon. This has been discussed before but not changed as of 2.6.19: http://lkml.org/lkml/2006/3/2/194

    However, after doing a quick look through of the source code of several drivers, the consensus is that pci_enable_device() always comes first. Which one of these calls is supposed to come first and why?

  2. For tearing down the driver, I get even more confused. Assuming pci_enable_device() comes first, I would expect that you first call pci_release_regions() prior to calling pci_disable_device() (i.e., following some symmetry). However, the kernel docs say that pci_release_regions() should come last. What makes matters more complicated is that I looked at many drivers and almost all of them had pci_release_regions() before pci_disable_device(), like I would expect. However, I then stumbled across this driver: https://elixir.bootlin.com/linux/v4.12/source/drivers/infiniband/hw/hfi1/pcie.c (code is reproduced below).

    void hfi1_pcie_cleanup(struct pci_dev *pdev)
    {
        pci_disable_device(pdev);
        /*
         * Release regions should be called after the disable. OK to
         * call if request regions has not been called or failed.
         */
        pci_release_regions(pdev);
    }
    

    Which function is supposed to come first when tearing down the driver? It seems that drivers in the kernel itself can't agree.

It'sPete
  • 5,083
  • 8
  • 39
  • 72
  • 2
    I guess the author either know something which no-one else does, or brought some stupid comment to the code. – 0andriy Jul 04 '18 at 23:23
  • 1
    One way to think about this is that pci regions and pci device status are not really that connected. The system will discover the "regions" via the PCI ID mechanics and can reserve them even if device is disabled. On the other hand, device can be enabled (have power) but with pci bridge not properly configured to handle it's "regions" it won't be able to communicate with the system, thus sitting there useless (or doing something on it's own, for all we know :-). – oakad Jul 06 '18 at 02:50
  • 1
    I have notified PCI subsystem maintainer, I think we would clarify this soon in upstream. – 0andriy Jul 23 '18 at 10:37

1 Answers1

2

The statement that gives a final say is as follows :

o wake up the device if it was in suspended state,

o allocate I/O and memory regions of the device (if BIOS did not),

o allocate an IRQ (if BIOS did not).

So, it makes no sense to ask kernel to reserve resource if there is none. In most cases when we do not need to allocate the resource because it has been done by bios, in those cases we can keep either function first but do only if you are absolutely sure.

Community
  • 1
  • 1
yashC
  • 887
  • 7
  • 20