
I have written a device driver kext for a hot-plug SCSI device, based somewhat on Wagerlabs code (using a driver-user client-application model) and everything works. The only remaining concern is that the driver appears not to be consistently freed, especially if the application crashes. For example, when I try to unload the kext, even with the device disconnected and the application closed, there are still outstanding instances of the driver and user client (with the driver generally outnumbering the user client).

I have logging in the driver functions like free(), and when I shut down the computer, I can see these being executed, so the instances can obviously still be terminated. What is the "right" way to ensure the driver instance is terminated and freed, even if the host application crashes, terminates improperly or things generally don't go to plan?

Inductiveload

1 Answer


If you've got user client class instances when no user client app is running, then you're definitely retaining the user client instances more often than you're releasing them. For example, you might be keeping a retained reference to client instances in the main driver class. In your user client class's stop() method, make sure to remove that client instance from the driver.
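To make the leak pattern concrete, here is a minimal userland sketch of it. It is plain C++, not I/O Kit code: `RefCounted`, `ModelDriver` and `ModelClient` are hypothetical stand-ins for `OSObject`, the driver and the user client, and the counting is deliberately simplified. It only illustrates why a client that stays in the driver's list after `stop()` can never reach a retain count of zero.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for OSObject-style manual reference counting.
struct RefCounted {
    int retainCount = 1;
    bool* freedFlag = nullptr;  // set to true when the object is destroyed
    virtual ~RefCounted() { if (freedFlag) *freedFlag = true; }
    void retain() { ++retainCount; }
    void release() { if (--retainCount == 0) delete this; }
};

struct ModelClient;

// Models a driver that keeps retained references to its user clients.
struct ModelDriver : RefCounted {
    std::vector<ModelClient*> clients;
    void addClient(ModelClient* c);
    void removeClient(ModelClient* c);
};

// Models a user client that retains its provider while it is started.
struct ModelClient : RefCounted {
    ModelDriver* provider = nullptr;
    void start(ModelDriver* d) {
        provider = d;
        provider->retain();
        provider->addClient(this);
    }
    void stop(bool removeFromDriver) {
        if (!provider) return;
        if (removeFromDriver) provider->removeClient(this);  // the step that is easy to forget
        provider->release();
        provider = nullptr;
    }
};

void ModelDriver::addClient(ModelClient* c) {
    c->retain();
    clients.push_back(c);
}

void ModelDriver::removeClient(ModelClient* c) {
    auto it = std::find(clients.begin(), clients.end(), c);
    if (it == clients.end()) return;
    clients.erase(it);
    c->release();
}

// Returns true if the client was freed once the "application" dropped its reference.
bool clientFreedAfterClose(bool removeFromDriver) {
    bool clientFreed = false;
    ModelDriver* driver = new ModelDriver;  // reference held by the "registry"
    ModelClient* client = new ModelClient;  // reference held by the "application"
    client->freedFlag = &clientFreed;
    client->start(driver);
    client->stop(removeFromDriver);
    client->release();                      // application closes or crashes
    bool result = clientFreed;
    if (!clientFreed) driver->removeClient(client);  // tidy up the model's own leak
    driver->release();
    return result;
}
```

In this model, `clientFreedAfterClose(true)` reports that the client was freed, while `clientFreedAfterClose(false)` leaves it alive with the driver's list as its only remaining owner, which matches the symptom of user client instances outliving the application.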

Another thing to watch out for: make sure you call the superclass implementations from your overridden versions of the built-in IOService methods such as stop(), free() etc. Not doing so will usually put I/O Kit into an inconsistent state.

Finally, a useful technique for debugging retain leaks in I/O Kit drivers is to log the retains and releases by overriding the methods with logging versions:

// Logs every retain, with a backtrace, before forwarding to the superclass.
void MyClass::taggedRetain(const void* tag) const
{
    OSReportWithBacktrace(
        "MyClass" CLASS_OBJECT_FORMAT_STRING "::taggedRetain(tag=%p)\n", CLASS_OBJECT_FORMAT(this), tag);
    IOService::taggedRetain(tag);
}

// Logs every release, with a backtrace, before forwarding to the superclass.
void MyClass::taggedRelease(const void* tag) const
{
    OSReportWithBacktrace(
        "MyClass" CLASS_OBJECT_FORMAT_STRING "::taggedRelease(tag=%p)\n", CLASS_OBJECT_FORMAT(this), tag);
    // Read the count before releasing: if this was the last reference,
    // the object may be gone afterwards and must not be touched again.
    int count = getRetainCount();
    IOService::taggedRelease(tag);
    if (count == 1)
        printf(
            "MyClass::taggedRelease(tag=%p) final done\n", tag);
    else
        printf(
            "MyClass" CLASS_OBJECT_FORMAT_STRING "::taggedRelease(tag=%p) done\n", CLASS_OBJECT_FORMAT(this), tag);
}

The macros in this code are defined in a header as follows:

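// The include-guard name is a placeholder; use whatever suits your project.
#ifndef MY_RETAIN_LOGGING_H
#define MY_RETAIN_LOGGING_H

#include <libkern/c++/OSObject.h>
#include <libkern/c++/OSMetaClass.h>

#define CLASS_OBJECT_FORMAT_STRING "[%s@%p:%dx]"
#define CLASS_OBJECT_FORMAT(obj) myClassName(obj), (obj), myRefCount(obj)

// Retain count of obj, or 0 for a null pointer.
inline int myRefCount(const OSObject* obj)
{
    return obj ? obj->getRetainCount() : 0;
}

// Runtime class name of obj, or "(null)" for a null pointer.
inline const char* myClassName(const OSObject* obj)
{
    if (!obj) return "(null)";
    return obj->getMetaClass()->getClassName();
}

#endif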

I should explain that taggedRetain() and taggedRelease() are the actual underlying implementation of retain() and release() - if you override the latter, you won't see any retains and releases coming from OSCollections, as they use the tagged versions (with a non-null tag).

The backtrace generated by OSReportWithBacktrace() is unfortunately just a bunch of hex pointers, but you can look those up using gdb.

In any case, by logging retains and releases for your objects, you can go through all retains and make sure they are matched by a release in the right place. Watch out for cycles!
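If the log gets long, a small userland helper can do the first pass of the matching. The sketch below is plain C++ and makes an assumption that should be checked against real output: that each logged line contains the text produced by the format strings above, i.e. `::taggedRetain(tag=...)` or `::taggedRelease(tag=...)`. The function name `tallyByTag` is hypothetical. It skips the trailing "done" lines so each release is counted once, and returns the net retains minus releases per tag.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Net (retains - releases) per tag, for log text shaped like the
// format strings used in the logging overrides. Assumed format only.
std::map<std::string, int> tallyByTag(const std::string& log)
{
    const std::string retainKey = "::taggedRetain(tag=";
    const std::string releaseKey = "::taggedRelease(tag=";
    std::map<std::string, int> net;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        int delta = 0;
        std::size_t keyLen = 0;
        std::size_t pos = line.find(retainKey);
        if (pos != std::string::npos) {
            delta = 1;
            keyLen = retainKey.size();
        } else {
            pos = line.find(releaseKey);
            if (pos == std::string::npos) continue;  // unrelated line
            delta = -1;
            keyLen = releaseKey.size();
        }
        std::size_t start = pos + keyLen;
        std::size_t end = line.find(')', start);
        if (end == std::string::npos) continue;      // malformed line
        // The "... done" lines repeat a release already counted above.
        if (line.find("done", end) != std::string::npos) continue;
        net[line.substr(start, end - start)] += delta;
    }
    return net;
}
```

A tag with a positive net count is a place to start looking. The tally is only a hint, because a retain and its matching release do not have to use the same tag.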

pmdj
  • Thanks for such a detailed answer! I may have found the leak of the user clients (IOService's not being freed by the application). However, I am still getting a leak of the driver class itself, which appears to be caused by a failure to get a call to `free()` after `detach()`. I get a call to free if I plug the device in and remove it again without the userspace application open, but if the user client ever gets opened, it won't get a call to free in the end. What could cause that, a retained reference to the driver? – Inductiveload Nov 20 '12 at 17:34
  • Yes, `free()` not being called is always a missing `release()`, though the cause is often not obvious. Make sure your user client object releases its provider if it retains it. If that's not the problem, is `terminate()` being called on your main object? `registerService()` seems to add a reference, which is only released by `terminate()`. – pmdj Nov 20 '12 at 18:37
  • `terminate()` doesn't appear to be being called in the cases when it goes wrong - what is this normally called by, and should I be looking to find out why something else didn't call it, or call it myself somewhere? – Inductiveload Nov 20 '12 at 18:46
  • It depends on what kind of driver it is. For something like a USB device, if you unplug it, the IOUSBInterface provider object will call `terminate()` on its clients (your driver instance). Similarly for anything else that builds on something where the underlying IOKit family supports hotplugging out of the box. If it's a purely custom or virtual device, and you called `registerService()`, you will need to call `terminate()` yourself when the device itself disappears. – pmdj Nov 22 '12 at 10:01
  • The driver is derived from IOSCSIPeripheralDeviceNub, and it is a USB device and appears under IOUSBInterface in ioreg, but terminate is not called when it is unplugged. Is there a special way to make my driver take notice of the unplugging (since it isn't a USB driver as such)? – Inductiveload Nov 23 '12 at 09:58
  • I haven't worked with SCSI drivers myself, so I can't comment on that particular aspect of it, but I doubt it matters. USB nubs will inform their clients of device removal by calling `terminate()`, then `stop()`, `detach()` and in theory eventually `free()`. Having user clients complicates this, but `terminate()` should always be called in this case. Are you sure you're logging this from your main driver class? To truly free everything, you'll need to message your user clients to say that the device is terminating. In userspace you should then `IOObjectRelease` etc. – pmdj Nov 23 '12 at 13:27
  • Regarding "messaging clients": You do this via `this->messageClients(kIOMessageServiceIsTerminated);` and in userspace, register for those notifications on your service object. When you receive the `kIOMessageServiceIsTerminated` notification, get rid of all your userspace handles for that device. – pmdj Nov 23 '12 at 13:34