4

Let's say I'm in this situation:

main.c :

 #include <stdio.h>
 #include <stdlib.h>

 #include "header.h"

 int iCanProcess (char* gimmeSmthToProcess);

 int processingFunctionsCount = 0;
 int (*(*processingFunctions)) (char*) = NULL;

 int addProcessingFunction(int (*fct)(char*)) {
     /* Grow the table by one slot; keep the old table if realloc fails. */
     int (**grown)(char*) = realloc(processingFunctions,
               sizeof *grown * (processingFunctionsCount + 1));
     if (grown == NULL)
         return -1;
     processingFunctions = grown;
     processingFunctions[processingFunctionsCount++] = fct;
     return 0;
 }

 int main(int argc, char *argv[]) {
     char* dataToProcess = "I am some veeeery lengthy data";
     addProcessingFunction(iCanProcess);

     [ ... ] 

     for(int i = 0; i < processingFunctionsCount; i++) {
         processingFunctions[i](dataToProcess);
     }

     free(processingFunctions);
     return 0;
 }

 int iCanProcess (char* gimmeSmthToProcess) { ... }

somefile.c :

 #include "header.h"

 int aFunction(char* someDataToProcess) { ... }  

header.h :

  #ifndef HEADER_DEF
  #define HEADER_DEF

  extern int processingFunctionsCount;
  extern int (*(*processingFunctions)) (char*);

  int addProcessingFunction(int (*fct)(char*));

  #endif

Is there ANY way, using macros or any other trick, to add aFunction to the processingFunctions array of function pointers without changing main.c or header.h every time I need to add one?

The problem here is not growing the array, since it can be reallocated easily, but doing so WITHOUT changing the main() function: there must be a way to detect that the file is present and compiled, and to pick up the function prototype, all from outside main().

I thought about using a preprocessor trick like this one, but I can't seem to find a proper way to do it...

(Side-note: This is a trimmed-down version of a bigger project, which is in fact base code to support parsers with the same output but different input. Each parser supports only some types of files, so I have an array of function pointers (one per parser, to check whether it is compatible) and I call each of them against the file contents. Then I ask the user to choose which parser they want to use. I have one file per parser, containing a "check" function, to see whether the parser can handle this file, and a "parse" function to actually do all the hard work. I can't change the header or the main.c file every time I add a parser.)
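
For illustration only, the layout described in this side-note could be sketched as below. The `struct parser` type, the toy `checkCsv`/`checkJson` functions and `findCompatibleParsers` are hypothetical names invented for this sketch, not code from the real project. The hand-written `parsers[]` table is exactly the part the question wants to avoid editing for every new parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of the two per-parser functions described above. */
struct parser {
    const char *name;
    int (*check)(const char *contents);  /* nonzero if the parser accepts the input */
    int (*parse)(const char *contents);  /* does the actual work */
};

/* Two toy parsers, invented purely for illustration. */
static int checkCsv(const char *s)  { return strchr(s, ',') != NULL; }
static int parseCsv(const char *s)  { (void)s; return 0; }
static int checkJson(const char *s) { return s[0] == '{'; }
static int parseJson(const char *s) { (void)s; return 0; }

/* The table that currently has to be edited by hand for each new parser. */
static const struct parser parsers[] = {
    { "csv",  checkCsv,  parseCsv  },
    { "json", checkJson, parseJson },
};

/* Store pointers to the compatible parsers in out[]; return how many were found. */
size_t findCompatibleParsers(const char *contents,
                             const struct parser **out, size_t max)
{
    size_t found = 0;
    for (size_t i = 0; i < sizeof parsers / sizeof parsers[0] && found < max; i++) {
        if (parsers[i].check(contents))
            out[found++] = &parsers[i];
    }
    return found;
}
```

The caller would then present the entries of `out[]` to the user and invoke the chosen entry's `parse` function.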

(Side-note 2 : this title is terrible... if you have any idea for a better one, please oh PLEASE feel free to edit it and remove this note. Thanks)

Magix
  • What Operating System? Portable? – Iharob Al Asimi Jan 10 '16 at 01:11
  • OS-independent, fully portable code :) – Magix Jan 10 '16 at 01:11
  • Interesting question! Macro tricks generally hide a static compile-time table, but that won't work here because you want decentralized function registration. By the way, your approach isn't fully compile-time anyway, since registration happens through memory allocation. With C++ it would be a piece of cake, but in standard C (without build scripts) I can't see how it could be done. – Christophe Jan 10 '16 at 12:59

2 Answers

4

You could make each function a module (a shared object, or a DLL on Windows) that exports a single symbol with a known name. At runtime, scan a directory for the .so or .dll files, load each one, and look up a pointer to that symbol. Suppose you have N modules, where the source code of the i-th module is

module.i.c

int function(char *parameter)
{
     // Do whatever you want here
     return THE_RETURN_VALUE;
}

Then compile each .c file into a shared object. I will use Linux for illustration; the same solution works on other POSIX systems, so it covers a lot of ground, and you can do a similar thing on Windows.

First generate the module.i.c files with this script

#!/bin/bash

for i in {0..100};
do
    cat > module.$i.c <<EOT
#include <stdlib.h>

int
function(char *parameter)
{
    // Deal with parameter
    return $i;
}
EOT
done

Now create a Makefile like this one

CC = gcc
LDFLAGS =
CFLAGS = -Wall -Werror -g3 -O0
FUNCTIONS = $(patsubst %.c,%.so, $(wildcard *.*.c))

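# Note: every recipe line below must start with a tab character, not spaces.
all: $(FUNCTIONS)
	$(CC) $(CFLAGS) $(LDFLAGS) main.c -o main -ldl

# -fPIC produces position-independent code, which shared objects generally need.
%.so: %.c
	$(CC) -shared -fPIC $(CFLAGS) $(LDFLAGS) $< -o $@

clean:
	@rm -fv *.so *.o main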

And the program that would load the modules (we assume that they are in the same directory as the executable)

#include <stdlib.h>
#include <limits.h>  /* PATH_MAX */
#include <dirent.h>
#include <string.h>
#include <stdio.h>
#include <dlfcn.h>

int
main(void)
{
    DIR *dir;
    struct dirent *entry;
    dir = opendir(".");
    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL)
    {
        void *handle;
        char path[PATH_MAX];
        int (*function)(char *);
        if (strstr(entry->d_name, ".so") == NULL)
            continue;
        if (snprintf(path, sizeof(path), "./%s", entry->d_name) >= sizeof(path))
            continue;
        handle = dlopen(path, RTLD_LAZY);
        if (handle == NULL)
            continue; // Better: report the error with `dlerror()'
        function = (int (*)(char *)) dlsym(handle, "function");
        if (function != NULL)
            fprintf(stdout, "function: %d\n", function("example"));
        else
            fprintf(stderr, "symbol-not-found: %s\n", entry->d_name);
        dlclose(handle);
    }
    closedir(dir);
    return 0;
}
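
One detail worth tightening: the `strstr()` test in the loop accepts any file name that merely contains `.so`, for example `notes.sorted.txt`. A stricter check could be sketched as follows; `hasSuffix()` is a hypothetical helper, not part of the original answer.

```c
#include <string.h>

/* Return 1 if name ends with suffix, 0 otherwise. */
int hasSuffix(const char *name, const char *suffix)
{
    size_t nameLen = strlen(name);
    size_t suffixLen = strlen(suffix);

    if (suffixLen > nameLen)
        return 0;
    /* Compare only the tail of name against the suffix. */
    return strcmp(name + (nameLen - suffixLen), suffix) == 0;
}
```

In the loop, the filter would then read `if (!hasSuffix(entry->d_name, ".so")) continue;`. Note that this also rejects versioned names such as `libfoo.so.1`, which may or may not be what you want.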

On Windows the idea is the same, but the API differs: you can't traverse the directory like the code above (use FindFirstFile()/FindNextFile() instead of opendir()/readdir()), and you need LoadLibrary() instead of dlopen(), GetProcAddress() instead of dlsym(), and FreeLibrary() instead of dlclose().

More information on how to secure the modules you load and their folder can be found in this question

Iharob Al Asimi
  • This is a very interesting approach, even though it is not really portable and the compilation process gets quite messy. If no other solution is found, then this answer will be accepted :) Moreover, there is no need to recompile every time I want to add a parser, which is a very good point. – Magix Jan 10 '16 at 01:43
  • I know it's not portable, but writing the Windows code is all it needs. Most systems are POSIX compliant and this will work there. I have done this for an application to load functionality at runtime, writing the Linux and Windows Code was easy. – Iharob Al Asimi Jan 10 '16 at 01:46
  • Wouldn't this cause any security issue where a malware would add a dll/so in the folder with this known symbol in order to get whatever malicious code to be executed ? – Magix Jan 10 '16 at 14:38
  • Yes maybe, but that would be solvable. For example you can have a checksum and try to secure your plugins directory. On Windows it might be a major concern, but still many Windows applications use plugin systems. You can always do things to ensure that you can trust a given plugin, but that goes way beyond the scope of this question. I recommend you read about the Qt plugin system; they have a plugin loader class. It's C++, but the basic idea is there. – Iharob Al Asimi Jan 10 '16 at 14:40
  • Accepted your answer because it gives the freedom not to recompile for every additional module. Thank you very much :) I may post a follow-up about how to secure the plugins directory if I don't find enough data anywhere else. – Magix Jan 10 '16 at 17:18
  • @MagixKiller Feel free to add it to this very post in order to complete it. – Iharob Al Asimi Jan 10 '16 at 17:23
  • I posted the follow-up and completed your answer :) – Magix Jan 11 '16 at 13:16
1

The preprocessor and standard C alone aren't going to be much help. The simplest solution is to generate the boilerplate with a script.

The generated code can easily be kept to fully portable Standard C.

If you put all the processing functions in a directory and perhaps tag them with a comment such as /* PROCESSOR */, then it's simple to grok the necessary proto information with a regex. Perl is nice for this sort of thing:

use strict;

sub emit_header_file {
  my $protos = shift;
  open(F, "> table_protos.h") || die $!;
  print F <<"END";
#ifndef TABLE_PROTOS_H
#define TABLE_PROTOS_H
void addAllProcessingFunctions(void);
void addProcessingFunction(int (*)(char *));
END
  foreach my $proto (@$protos) {
    print F "int $proto->[0](char *$proto->[1]);\n";
  }
  print F "#endif\n";
  close F;
}

sub emit_code_file {
  my $protos = shift;
  open(F, "> table_builder.c") || die $!;
  print F <<"END";
#include "table_protos.h"
void addAllProcessingFunctions(void) {
END
  foreach my $proto (@$protos) {
    print F "  addProcessingFunction($proto->[0]);\n";
  }
  print F "}\n";
  close F;
}

sub main {
  my @protos;
  my $dir = $ARGV[0];
  opendir(DIR, $dir) || die $!;
  while (my $fn = readdir(DIR)) {
    next unless $fn =~ /\.c$/;
    local $/;
    open(F, "$dir/$fn") || die "$!: $fn";
    my $s = <F>;
    my @proto = $s =~ m|/\*\s*PROCESSOR\s*\*/\s*int\s*(\w+)\s*\(\s*char\s*\*\s*(\w+)\s*\)|;
    push @protos, \@proto if @proto;
    print STDERR "Failed to find proto in $fn\n" unless @proto;
    close(F);
  }
  closedir(DIR);
  @protos = sort { $a->[0] cmp $b->[0] } @protos;
  emit_header_file(\@protos);
  emit_code_file(\@protos);
}

main;

So if I create a directory called foo and put three processing files there:

/* p1.c */
#include "table_protos.h"

// This is a processor.

/* PROCESSOR */
int procA(char *s) {
  return 0;
}

/* p2.c */
#include "table_protos.h"

/*PROCESSOR*/ int procB (
    char * 
      string_to_parse)
{ return 0; }

And p3.c is similar. I varied the whitespace just to check the regex.

Then we run

perl grok.pl foo

We end up with table_protos.h:

#ifndef TABLE_PROTOS_H
#define TABLE_PROTOS_H
void addAllProcessingFunctions(void);
void addProcessingFunction(int (*)(char *));
int procA(char *s);
int procB(char *string_to_parse);
int procC(char *param);
#endif

and table_builder.c:

#include "table_protos.h"
void addAllProcessingFunctions(void) {
  addProcessingFunction(procA);
  addProcessingFunction(procB);
  addProcessingFunction(procC);
}

You can #include and call these respectively as needed.
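
As a rough sketch of how the generated pieces might plug into the registry from the question: everything below is condensed into a single file so it compiles on its own, and `runAllProcessors()` plus the made-up return values are assumptions of this sketch. Note that the generated header declares `addProcessingFunction` as returning `void`, whereas the question's header.h declares it as returning `int`; one of the two would need to change before they can be used together. The sketch follows the generated header.

```c
#include <stdlib.h>

/* Stand-ins for the declarations in the generated table_protos.h. */
void addAllProcessingFunctions(void);
void addProcessingFunction(int (*)(char *));
int procA(char *s);
int procB(char *string_to_parse);

/* Registry, modelled on the question's main.c. */
int processingFunctionsCount = 0;
int (**processingFunctions)(char *) = NULL;

void addProcessingFunction(int (*fct)(char *))
{
    int (**grown)(char *) = realloc(processingFunctions,
                                    sizeof *grown * (processingFunctionsCount + 1));
    if (grown == NULL)
        return;  /* out of memory: leave the table unchanged */
    processingFunctions = grown;
    processingFunctions[processingFunctionsCount++] = fct;
}

/* Stand-ins for p1.c and p2.c; the return values are made up for the demo. */
int procA(char *s) { (void)s; return 1; }
int procB(char *string_to_parse) { (void)string_to_parse; return 2; }

/* Stand-in for the generated table_builder.c. */
void addAllProcessingFunctions(void)
{
    addProcessingFunction(procA);
    addProcessingFunction(procB);
}

/* Hypothetical driver: register once, run each processor, sum the results. */
int runAllProcessors(char *data)
{
    int total = 0;
    if (processingFunctionsCount == 0)
        addAllProcessingFunctions();
    for (int i = 0; i < processingFunctionsCount; i++)
        total += processingFunctions[i](data);
    return total;
}
```

With this arrangement main.c changes once, to call `addAllProcessingFunctions()`, and after that only the script's output changes when a processor file is added.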

Note that you can make static tables of function pointers, which avoids the code in addAllProcessingFunctions. Of course you can generate the static table with a script as well.
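
A static table of that kind might look like the sketch below. The names `processorTable`, `processorCount` and `runProcessorTable` are hypothetical, and the two processors are defined inline (with made-up return values) only so the example compiles on its own; a generator script would emit prototypes and the table instead.

```c
#include <stddef.h>

/* Stand-ins for the processors found by the script. */
static int procA(char *s) { (void)s; return 10; }
static int procB(char *s) { (void)s; return 20; }

/* The part a script would generate: a constant table and its length. */
static int (*const processorTable[])(char *) = {
    procA,
    procB,
};
static const size_t processorCount =
    sizeof processorTable / sizeof processorTable[0];

/* Call every processor in table order and return the last result (-1 if none). */
int runProcessorTable(char *data)
{
    int last = -1;
    for (size_t i = 0; i < processorCount; i++)
        last = processorTable[i](data);
    return last;
}
```

Because the table is built at compile time, there is no `realloc()` and nothing to free, at the cost of regenerating and recompiling whenever a processor is added.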

Gene