-1

I'm working with a kernel module, which is causing a null pointer dereference when inserted. I've tracked the error down to this function in swnode.c (not part of the kernel module, but is called by the module):

static struct fwnode_handle *
software_node_get_next_child(const struct fwnode_handle *fwnode,
                 struct fwnode_handle *child)
{
    struct swnode *p = to_swnode(fwnode);
    struct swnode *c = to_swnode(child);

    if (!p || list_empty(&p->children) ||
        (c && list_is_last(&c->entry, &p->children)))
        return NULL;

    if (c) {
        c = list_next_entry(c, entry);
        if (c->node)
                pr_info("child node named %s\n", c->node->name);
    } else {
        c = list_first_entry(&p->children, struct swnode, entry);
    }

    return fwnode_handle_get(&c->fwnode);
}

I added the pr_info("child node named %s\n", c->node->name); call for debugging, and that line causes the null pointer dereference. Prior to that the error was with return fwnode_handle_get(&c->fwnode) which caused an oops complaining I was executing things in NX memory; it's apparent that c->fwnode is NULL, so I can try to work out why that is, I'm just wondering why my debug print caused an error too.

This situation confuses me; I'm explicitly checking that c and c->node are not null, in a way that I thought should protect against this kind of error (based on answers like this). The dereference operations are against c (because, by my understanding, c->member is equivalent to (*c).member) and c->node. So; why, given the pr_info call should only be evaluated if c and c->node are not null does it cause a null pointer dereference?

EDIT:

Close voters need to read both question and threads more carefully. This issue is not caused by a typo. The missing curly braces in the original example (which has since been edited to include them) are not the cause of the issue.

Adrian Mole
  • 49,934
  • 160
  • 51
  • 83
Dan Scally
  • 1,922
  • 1
  • 19
  • 31
  • 3
    You are missing curly braces after `if(c)`. Otherwise the `if(c->node)` should not be indented – perivesta Sep 03 '20 at 09:49
  • You don't check `c->node->name` at all. It can still be `NULL` even if `c->node != NULL`. – Gerhardh Sep 03 '20 at 09:56
  • @Gerhardh agreed, but that's not a null pointer dereference; in that case I'd just expect the string to print out "child node named (null)" - at least that's what usually happens – Dan Scally Sep 03 '20 at 09:58
  • Why would you expect to print `"(null)"` if you pass a `NULL` pointer? There is no requirement for `printf` to check before derefencing the pointer you pass for `%s`. Some libraries do it, but they are not required to do. – Gerhardh Sep 03 '20 at 09:59
  • @Gerhardh that's just the behaviour I've seen so far. I'm not especially au fait with C.] – Dan Scally Sep 03 '20 at 10:00
  • That's the problem. "This is what normally happens" sometimes is the entry into world of **undefined behaviour**. – Gerhardh Sep 03 '20 at 10:01
  • @Gerhardh fair enough, I'll test that. – Dan Scally Sep 03 '20 at 10:14
  • @Gerhardh so if I do `if (c->node->name) {pr_info("passed\n");}` it does throw the error on the if. That being the case though...how can I do a null pointer check to prevent the problem? – Dan Scally Sep 03 '20 at 10:45
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/220911/discussion-between-gerhardh-and-dan-scally). – Gerhardh Sep 03 '20 at 10:50
  • 1
    You posted code with an obvious bug, and received an answer pointing out that obvious bug. The proper way forward would be to (probably accept that answer and) ask a new question with corrected code and a question about any remaining bugs. Now you have moved the goalposts so that the answer you got is no longer valid. This reduces the incentive for others to answer, too - what if you change the question again once they find a bug in your code? – tripleee Sep 03 '20 at 12:36
  • This is one of the reasons we always insist on a [mre] - hopefully you could reduce the code in your next question to only have a single bug, or, even better, discover the solution yourself while working out which part isn't doing what you expect. – tripleee Sep 03 '20 at 12:37
  • @triplee I disagree with your characterisation of the events here, but if you don't feel like answering that's entirely your prerogative. – Dan Scally Sep 03 '20 at 12:47
  • I disagree with reopening this but closing it again is probably not feasible. You will probably have a better experience if you try to adhere to this site's rules and philosophy. – tripleee Sep 03 '20 at 12:52
  • 1
    FWIW: I cast the third Reopen vote on this question and, at the same time, edited out your comment that was addressed to the "Close Voters." I then had second thoughts, and rolled back the edit, so that the context was more apparent. I raised the issue in SOCVR, where there was a brief conversation. – Adrian Mole Sep 03 '20 at 13:01
  • That conversation has now been moved the the chatroom created here. – Adrian Mole Sep 03 '20 at 13:10
  • I suspect that the mystery of the null dereference shall be revealed by placing a breakpoint at the offending line, then inspect all involved variables in your favourite debugger. For example, perhaps `c->node->name` is NULL? – Lundin Sep 03 '20 at 13:26
  • 1
    @Gerhardh thanks again for your help in the chat; you were right. The value of `c->node` is not null but is also not a valid memory address. Weirdly the `IS_ERR` macros don't seem to actually catch it (when I print it, it shows the value 3 rather than a negative number, which is presumably why. No idea what it is in that case!); but regardless I'd happy for the explanation as to what was happening. Post an answer for a green tick. – Dan Scally Sep 03 '20 at 14:16
  • Can you provide a bit more details, i.e. what module code looks like that makes it go to this function? And what is the source of the properties (DT, ACPI, swnode, ...)? – 0andriy Sep 04 '20 at 10:12
  • 1
    @0andriy I can, but I'm going to open a new question for that. The chappie above spotted why the null ptr dereference was happening, so I think that closes this one and I'll make a new one for "why is this null to begin with" – Dan Scally Sep 04 '20 at 11:26
  • 1
    Please, do. And don't forget to provide necessary information, such as Linux kernel version in use and if there were patches applied, esp. if they affect the area in question. – 0andriy Sep 04 '20 at 12:10
  • @0andriy yep, will do – Dan Scally Sep 04 '20 at 12:11

1 Answers1

4

Your code

if (c)
    c = list_next_entry(c, entry);
    if (c->node)
            pr_info("child node named %s\n", c->node->name);
else
    c = list_first_entry(&p->children, struct swnode, entry);

is equivalent to

if (c) {
    c = list_next_entry(c, entry);
}
if (c->node) {
    pr_info("child node named %s\n", c->node->name);
} else {
    c = list_first_entry(&p->children, struct swnode, entry);
}

Therefore, c->node will be evaluated regardless of whether c is NULL.

Add braces to have the check for c work.

if (c) { /* add a brace */
    c = list_next_entry(c, entry);
    if (c->node)
            pr_info("child node named %s\n", c->node->name);
} /* add a brace */
else
    c = list_first_entry(&p->children, struct swnode, entry);
MikeCAT
  • 73,922
  • 11
  • 45
  • 70
  • argh. Thanks. My instinct is to always brace ifs, but then I see people skipping them and feel like I should too :P – Dan Scally Sep 03 '20 at 10:00
  • Anyhow; braces added, no dice. I still get the error. – Dan Scally Sep 03 '20 at 10:12
  • In that case, most likely `list_next_entry` is returning `null` and making `c` null, so `c->node` causes the error? – Fred Sep 03 '20 at 10:20
  • @Fred I have checked that with `if (!c) {pr_info("c is now null\n");}` and that printout never appears. – Dan Scally Sep 03 '20 at 10:38
  • 2
    @Fred It's a circular list with a head (non-element) node. `c = list_next_entry(c, entry);` will not set `c` to `NULL` but the result will only be valid if the original `c` was not the last element on the list. But this code is not reachable in that case due to earlier checks in the function (the `list_is_last()` call). – Ian Abbott Sep 04 '20 at 09:45