
Is there a facility for the C language that allows run-time struct introspection?

The context is this: I've got a daemon that responds to external events, and for each event we carry around an execution context struct (the "context"). The context is big and messy, and contains references to all sorts of state.

Once the event has been handled, I would like to be able to run the context through a filter, and if it matches some set of criteria, drop a log message to help with debugging. However, since I hope to use this for field debugging, I won't know what criteria will be useful to filter on until run time.

My ideal solution would allow the user to, essentially, write a C-style boolean expression and have the program use that. Something like:

activate_filter context.response_time > 4.2 && context.event.event_type == foo_event

Ideas that have been tossed around so far include:

  • Providing a limited set of fields that we know how to access.
  • Wrapping all the relevant structs in some sort of macro that generates introspection tools at run time.
  • Writing a python script that knows where (versioned) headers live, generates C code and compiles it to a dll, which the daemon then loads and uses as a filter. Obviously this approach has some extra security considerations.

Before I start in on some crazy design goose chase, does anyone know of examples of this sort of thing in the wild? I've done some googling but haven't come up with much.

Dan
  • There is no built-in introspection. You can design systems that approximate introspection (statically defined structures that describe other structures), and the `sizeof` operator and the `offsetof` macro from `<stddef.h>` can be a help. The type encoding is a whole separate bag'o'worms. – Jonathan Leffler Jan 09 '15 at 21:52
  • You might find [Is there a way to print `struct` member in a loop without naming each member in C?](http://stackoverflow.com/questions/27496245/is-there-a-way-to-print-struct-members-in-a-loop-without-naming-each-member-in-c/27497861#27497861) helpful, or you might not. – Jonathan Leffler Jan 09 '15 at 21:55
  • It's possible to do introspection [using `_Generic`](https://stackoverflow.com/a/17290414/975097), but this works at compile-time instead of at runtime. – Anderson Green Jul 13 '21 at 20:18
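
A minimal sketch of the "statically defined structures that describe other structures" idea from the first comment. The two-member `struct context`, the `field_desc` record, and the helper names `find_field` and `read_field` are all hypothetical; only `offsetof` and `sizeof` come from the language itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for the real context struct. */
struct context {
    double response_time;
    int    event_type;
};

enum field_type { FIELD_INT, FIELD_DOUBLE };

/* One statically defined record per struct member. */
struct field_desc {
    const char     *name;
    enum field_type type;
    size_t          offset;
    size_t          size;
};

/* Build a descriptor from the member name; sizeof does not evaluate its operand. */
#define FIELD(st, member, ftype) \
    { #member, ftype, offsetof(st, member), sizeof(((st *)0)->member) }

static const struct field_desc context_fields[] = {
    FIELD(struct context, response_time, FIELD_DOUBLE),
    FIELD(struct context, event_type,    FIELD_INT),
};

/* Look up a member by name; returns NULL if the name is unknown. */
static const struct field_desc *find_field(const char *name)
{
    size_t n = sizeof context_fields / sizeof context_fields[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(context_fields[i].name, name) == 0)
            return &context_fields[i];
    return NULL;
}

/* Read a member as a double; returns 0 on success, -1 if the name is unknown. */
static int read_field(const struct context *ctx, const char *name, double *out)
{
    const struct field_desc *f = find_field(name);
    if (f == NULL)
        return -1;
    const char *base = (const char *)ctx + f->offset;
    switch (f->type) {
    case FIELD_INT: {
        int v;
        memcpy(&v, base, sizeof v);
        *out = v;
        return 0;
    }
    case FIELD_DOUBLE:
        memcpy(out, base, sizeof *out);
        return 0;
    }
    return -1;
}
```

A filter would call something like `read_field(&ctx, "response_time", &v)` and compare `v` against the user's threshold. The table has to be kept in sync with the struct by hand, which is the main cost of this approach.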

3 Answers


I would also suggest tackling this issue from another angle. The key words in your question are:

The context is big and messy

And that's where the issue is. Once you clean this up, you'll probably be able to come up with a clean logging facility.

Consider redefining all the fields in your context struct in some easy, pliable format, like XML: a simple XML schema that lists all the members of the struct, their types, and maybe some other metadata, even a comment that documents each field.

Then, throw together a quick and dirty stylesheet that reads the XML file and generates a compilable C struct that your code actually uses. Then write a different stylesheet that cranks out robo-generated code that enumerates each field in the struct and converts each field into a string.
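
The XML-plus-stylesheet pipeline itself is not shown here. As a smaller stand-in for the same single-source-of-truth idea, the sketch below uses an X-macro inside C: one field list is expanded once into the struct definition and once into field-to-string code. The field names and the `context_to_string` helper are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical field list: X(type, name, printf format). */
#define CONTEXT_FIELDS(X)           \
    X(double, response_time, "%g")  \
    X(int,    event_type,    "%d")

/* Expansion 1: the struct definition itself. */
struct context {
#define X(type, name, fmt) type name;
    CONTEXT_FIELDS(X)
#undef X
};

/* Expansion 2: render every field as "name=value " into buf.
 * Returns 0 on success, -1 on a formatting error or truncated output. */
static int context_to_string(const struct context *ctx, char *buf, size_t len)
{
    size_t used = 0;
#define X(type, name, fmt)                                            \
    if (used < len) {                                                 \
        int n = snprintf(buf + used, len - used,                      \
                         #name "=" fmt " ", ctx->name);               \
        if (n < 0)                                                    \
            return -1;                                                \
        used += (size_t)n;                                            \
    }
    CONTEXT_FIELDS(X)
#undef X
    return used < len ? 0 : -1;
}
```

Adding a member then means adding one line to `CONTEXT_FIELDS`. An external generator, as described above, gives more room for metadata such as comments or versioning, at the price of an extra build step.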

From that, bolting on a logging facility of some kind, with a user-provided filtering string becomes an easier task. You do have to come up with some way of parsing an arbitrary filtering string. Knowledge of lex and yacc would come in handy.
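
Before reaching for lex and yacc, a single comparison can be handled with `sscanf`. The sketch below is an assumption about what a first version might look like, not a full expression parser: it accepts one `<field> <op> <number>` clause, has no `&&`, no parentheses, no symbolic constants such as `foo_event`, and does not reject trailing text. The `clause` struct and both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* One parsed comparison, e.g. "context.response_time > 4.2". */
struct clause {
    char   field[64];
    char   op[3];
    double value;
};

/* Parse "<field> <op> <number>"; returns 0 on success, -1 on bad input. */
static int parse_clause(const char *text, struct clause *c)
{
    int got = sscanf(text, " %63[A-Za-z0-9_.] %2[<>=!] %lf",
                     c->field, c->op, &c->value);
    return got == 3 ? 0 : -1;
}

/* Compare lhs (the field's current value) against the clause.
 * Returns 1 for a match, 0 for no match, -1 for an unknown operator. */
static int eval_clause(const struct clause *c, double lhs)
{
    if (strcmp(c->op, ">")  == 0) return lhs >  c->value;
    if (strcmp(c->op, "<")  == 0) return lhs <  c->value;
    if (strcmp(c->op, ">=") == 0) return lhs >= c->value;
    if (strcmp(c->op, "<=") == 0) return lhs <= c->value;
    if (strcmp(c->op, "==") == 0) return lhs == c->value;
    if (strcmp(c->op, "!=") == 0) return lhs != c->value;
    return -1;
}
```

The caller is expected to map `c.field` to a value using whatever field enumeration the generated code provides, then pass that value as `lhs`. Parsing once when the filter is installed and evaluating per event keeps the per-event cost to a lookup and a comparison.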

Things of this nature have been done before.

The XCB library is a C client library for the X11 protocol. The protocol defines various kinds of binary messages, which are essentially simple structs that the client and the server toss to each other over a socket. In libxcb, all X11 messages and all datatypes inside them are described in an XML definition, and a stylesheet robo-generates the C struct definitions and the code to parse them, providing a fairly clean C API to parse and generate X11 messages.

Sam Varshavchik

You are probably approaching this problem from the wrong side.

Logging is typically used to facilitate debugging. The program writes all sorts of events to a log file. To extract the interesting entries, filtering is applied to the log file.

Sometimes a program generates just too many events; logging libraries usually address this issue by offering verbosity control. Basically, a logging function takes an additional parameter giving the verbosity level of the current message. If the value is above the globally configured threshold, the message gets discarded. Some libraries even allow controlling the verbosity level on a per-module basis (e.g. google log).
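
A minimal sketch of such a verbosity check, not modeled on any particular library. The names `log_threshold` and `log_msg` are hypothetical, and a real implementation would likely make the threshold configurable at run time and per module.

```c
#include <stdarg.h>
#include <stdio.h>

/* Globally configured threshold; messages above it are discarded. */
static int log_threshold = 1;

/* Write a printf-style message to stderr if it is not too verbose.
 * Returns 1 if the message was written, 0 if it was discarded. */
static int log_msg(int verbosity, const char *fmt, ...)
{
    if (verbosity > log_threshold)
        return 0;

    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    return 1;
}
```

With the threshold at 1, `log_msg(0, "event %d handled", id)` is written while `log_msg(2, "noisy detail")` is dropped after a single integer comparison, before any formatting work is done.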

Another possible approach is to leverage the power of a debugger, since the debugger has access to all sorts of meta information. One can create a conditional breakpoint that tests variables in scope against arbitrary conditions. Once the program stops, any information can be extracted from the scope. This can be automated using the scripting facilities provided by a debugger (gdb has great ones).

Finally there are tools generating glue code to use C libraries from scripting languages. One example is SWIG. It analyzes a header file and generates code allowing a scripting language to invoke functions, access structure fields, etc.

Your filter expression will become a program in, say, Lua (other scripting languages are supported as well). You invoke this program, passing in the pointer to the execution context struct (the "context"). Thanks to the accessors generated by SWIG, the Lua program can examine any field in the structure.

Nick Zavaritsky
  • much better, one tool to learn rather than learning lex and yacc and xml parsing, etc. – user3629249 Jan 09 '15 at 23:28
  • Thanks for the response! You're right that this is an atypical use of logging. Unfortunately, this program handles (tens of) thousands of events per second, and logging all of them has been shown to have a big performance impact. I've done some experimentation with basic filters and they don't seem to have the same slow-down. I suspect that running in a debugger will have similar performance implications, and moreover we're hoping to use this facility on live servers at customer sites. I'll definitely check out SWIG though. – Dan Jan 12 '15 at 18:54

I generated introspection metadata using a SWIG-CSV parser.

Suppose the C++ code contains a class like the following:

class Bike {
public:
    int color;      // color of the bike
    int gearCount;      // number of configurable gear
    Bike() {
        // bla bla
    }
    ~Bike() {
        // bla bla
    }
    void operate() {
        // bla bla
    }
};

Then it will generate the following CSV metadata:

Bike|color|int|variable|public|
Bike|gearCount|int|variable|public|
Bike|operate|void|function|public|f().

Now it is easy to parse the CSV file with Python or C/C++ if needed.

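import csv

# Python 3: open in text mode with newline='' as the csv module expects
with open('bike.csv', newline='') as csvfile:
    bike_metadata = csv.reader(csvfile, delimiter='|')
    for row in bike_metadata:
        print(row)  # each row is a list of the '|'-separated fields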
KRoy