A way to efficiently parse function pointer declaration syntax

Question

So, until now, I was pretty sure my mental pointer-to-function parser was able to parse even the toughest of pointers... how wrong was I! While reading some legacy code I found this:

void (*(*somename)(void (*)()))(void (*)());

Apparently, it means declare somename as pointer to function (pointer to function returning void) returning pointer to function (pointer to function returning void) returning void (according to http://cdecl.org, at least).

It seems that I oversimplified the way function pointer declarations work. I was pretty sure the syntax is return-type(*variable-name)(argument-types...). It works for a lot of cases, but not for complex ones, like above. How could I go about reading such unordinary and complex declarations, without having to think about all the grammar rules and trying to figure out if I should read from left to right or in reverse or in some other weird way?

You're going to have to think hard about such brutal function declarations — there is no way to avoid it. That is horrid — obnoxiously horrid. The usual rule for parsing declarations of all sorts is called "the spiral rule". — Jonathan Leffler, Aug 26 '20 at 21:22
proper typedefs can hide that kind of ugly and make it almost readable. — Michael Dorgan, Aug 26 '20 at 21:22
See also [Understanding typedefs for function pointers in C](https://stackoverflow.com/q/1591361/15168). For the 'spiral rule', see [Clockwise/Spiral Rule](http://c-faq.com/decl/spiral.anderson.html) and also, on SO, [The Spiral Rule — when is it in error?](https://stackoverflow.com/q/16260417/15168). There are likely others — Google search term 'c declaration spiral rule' should help. — Jonathan Leffler, Aug 26 '20 at 21:31

score 3 · Answer 1 · answered Aug 26 '20 at 21:55

One of my professors taught us how to do this using the "right-left rule." He has documented this here.

Here is how I would apply it to this declaration (start by moving right from the identifier).

void (*(*somename)(void (*)()))(void (*)());
         +-------^                             somename
void (*(*somename)(void (*)()))(void (*)());
        ^--------+                             is pointer
void (*(*somename)(void (*)()))(void (*)());
       ^---------+                             (move left)
void (*(*somename)(void (*)()))(void (*)());
       +----------^                            to function
void (*(*somename)(void (*)()))(void (*)());
       +----------------------^                taking (void (*)())
void (*(*somename)(void (*)()))(void (*)());
      ^-----------------------+                returning pointer
void (*(*somename)(void (*)()))(void (*)());
     ^------------------------+                (move left)
void (*(*somename)(void (*)()))(void (*)());
     +-------------------------^               to function
void (*(*somename)(void (*)()))(void (*)());
     +------------------------------------^    taking (void (*)())
void (*(*somename)(void (*)()))(void (*)());
^-----------------------------------------+    returning void

You can then apply the rule to each argument in the argument lists, starting with whatever's in parentheses since we don't have identifiers in this case:

void (*)()
      +^      pointer
void (*)()
     ^-+      (move left)
void (*)()
     +--^     to function
void (*)()
^--------+    returning void

score 1 · Answer 2 · answered Aug 26 '20 at 21:33

The trick is to use the clockwise spiral rule: http://c-faq.com/decl/spiral.anderson.html. It is a little difficult here because there are so many parentheses, but once you figure it out it should be fine.

You can also alias part of the complex declaration with a label and come back to the label once you understand the rest. For example:

void (*T)(void (*)());

where T is a substitute for (*somename)(void (*)())

Eric Postpischil · Answer 3 · 2020-08-26T21:40:06.593

The C grammar allows indefinite nesting of various things, so there is no limit on how much memory parsing a declaration may require. Tackling this one:

In void (*(*somename)(void (*)()))(void (*)()), we see there are function parameters, so let’s separate them a bit.
Proceeding from the right, we can find the match to the rightmost parenthesis and insert spaces for visualization: void (*(*somename)(void (*)())) (void (*)()).
So we see this declares (*(*somename)(void (*)())) to be a function returning void and taking a parameter of type void (*)(), which is a pointer to a void function with no prototype.
Next, analyze (*(*somename)(void (*)())). The left and right parentheses match, so this is *(*somename)(void (*)()).
So what remains, (*somename)(void (*)()), is a pointer to the previous thing (a void function taking a pointer to a void function with no prototype).
If the rest is simple enough, we may read it directly: somename is a pointer to a function whose parameter is a pointer to a void function with no prototype.

Thus, somename points to a function that:

takes a pointer to a void function with no prototype, and
returns a pointer to a void function taking a pointer to a void function with no prototype.

If a declaration truly defeats your ability to parse it without aid, you could construct a tree describing it. The C grammar naturally corresponds to a tree, and learning the relevant theory and correspondences between grammar and parsing is a part of a computer science curriculum. This would not be “efficient” for humans as the question asks, but it is a deterministic way to analyze declarations.

score 1 · Accepted Answer · answered Aug 27 '20 at 14:01

The method I've developed is to start with the leftmost identifier and work out, keeping in mind the following precedence rules:

T *a[N];   // a is an array of pointers
T (*a)[N]; // a is a pointer to an array
T *f();    // f is a function returning a pointer
T (*f)();  // f is a pointer to a function

and doing that recursively for any function parameters.

I'm going to use λ to represent unnamed parameters, so we get something like this:

         somename                               -- somename is
        *somename                               -- a pointer to
       (*somename)(           )                 --   a function taking
       (*somename)(       λ   )                 --     unnamed parameter is
       (*somename)(      *λ   )                 --     a pointer to
       (*somename)(     (*λ)())                 --       a function taking unspecified parameters
       (*somename)(void (*λ)())                 --       returning void
      *(*somename)(void (*λ)())                 --   returning a pointer to
     (*(*somename)(void (*λ)()))(           )   --     a function taking
     (*(*somename)(void (*λ)()))(       λ   )   --       unnamed parameter is
     (*(*somename)(void (*λ)()))(      *λ   )   --       a pointer to
     (*(*somename)(void (*λ)()))(     (*λ)())   --         a function taking unspecified parameters
     (*(*somename)(void (*λ)()))(void (*λ)())   --         returning void
void (*(*somename)(void (*λ)()))(void (*λ)());  --     returning void

In English, somename is a pointer to a function that takes a pointer to another function as an argument and returns a pointer to yet another function that takes a pointer to still another function as its argument and returns void.

Types this obnoxious are rare in the wild, but they do pop up occasionally.
