14

Is there any way that I can discover the type of a variable automatically in C, either through some mechanism within the program itself, or--more likely--through a pre-compilation script that uses the compiler's passes up to the point where it has parsed the variables and assigned them their types? I'm looking for general suggestions about this. Below is more background about what I need and why.

I would like to change the semantics of the OpenMP reduction clause. At this point, it seems easiest simply to replace the clause in the source code (through a script) with a call to a function, and then I can define the function to implement the reduction semantics I want. For instance, my script would convert this

#pragma omp parallel for reduction(+:x)

into this:

my_reduction(PLUS, &x, sizeof(x));
#pragma omp parallel for

where, earlier, I have (say)

enum reduction_op {PLUS, MINUS, TIMES, AND,
  OR, BIT_AND, BIT_OR, BIT_XOR, /* ... */};

And my_reduction has signature

void my_reduction(enum reduction_op op, void * var, size_t size);

Among other things, my_reduction would have to apply the addition operation to the reduction variable as the programmer had originally intended. But my function cannot know how to do this correctly. Although it knows the kind of operation (PLUS), the location of the original variable (var), and the size of the variable's type, it does not know the variable's type itself. Specifically, it does not know whether var has an integral or floating-point type. From a low-level POV, the addition operation for those two classes of types is completely different.

If only the nonstandard operator typeof, which GCC supports, would work the way sizeof works--returning some sort of type variable--I could solve this problem easily. But typeof is not really like sizeof: it can only be used, apparently, in l-value declarations.

Now, the compiler obviously does know the type of x before it finishes generating the executable code. This leads me to wonder whether I can somehow leverage GCC's parser, just to get x's type and pass it to my script, and then run GCC again, all the way, to compile my altered source code. It would then be simple enough to declare

enum var_type { INT8, UINT8, INT16, UINT16, /* ,..., */ FLOAT, DOUBLE};
void my_reduction(enum reduction_op op, void * var, enum var_type vtype);

And my_reduction can cast appropriately before dereferencing and applying the operator.
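To make the intended dispatch concrete, here is a minimal sketch of the cast-then-apply step. The helper name combine_partial, its extra partial argument, and the trimmed-down enum members are assumptions made for illustration; they are not part of the question's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down, illustrative versions of the enums from the question. */
enum reduction_op { PLUS, TIMES };
enum var_type { INT32, FLOAT, DOUBLE };

/* Applies "*acc = *acc op *partial" for one concrete element type T. */
#define APPLY_AS(T)                          \
    do {                                     \
        T *a = acc;                          \
        const T *p = partial;                \
        switch (op) {                        \
        case PLUS:  *a += *p; break;         \
        case TIMES: *a *= *p; break;         \
        }                                    \
    } while (0)

/* Hypothetical helper (not from the question): folds *partial into *acc,
   choosing the arithmetic from the type tag instead of from the size. */
void combine_partial(enum reduction_op op, void *acc, const void *partial,
                     enum var_type vtype)
{
    assert(acc != NULL && partial != NULL);
    switch (vtype) {
    case INT32:  APPLY_AS(int32_t); break;
    case FLOAT:  APPLY_AS(float);   break;
    case DOUBLE: APPLY_AS(double);  break;
    }
}
```

A call such as combine_partial(PLUS, &total, &thread_sum, DOUBLE) would then perform floating-point addition, while the same call tagged INT32 would perform integer addition on the same-shaped void pointers.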

As you can see, I am trying to create a kind of "dispatching" mechanism in C. Why not just use C++ overloading? Because my project constrains me to work with legacy source code written in C. I can alter the code automatically with a script, but I cannot rewrite it into a different language.

Thanks!

Ciro Santilli OurBigBook.com
Amittai Aviram
  • 2
    How about post-processing your source code with some tools/scripts? E.g. parse it with clang, find the types, insert/adjust type-specific code, compile? – Alexey Frunze Feb 06 '12 at 08:37
  • Thanks, Alex. That sounds like it's going in the right direction. – Amittai Aviram Feb 06 '12 at 13:02
  • 1
    I read somewhere that user-defined reductions will be part of the 3.1 or 4.0 standard. Hm, 3.1 says: `reduction({operator|intrinsic_procedure_name}:list)`. I never tried intrinsic_procedure_name, although it would only partly solve your type detection. – Bort Feb 06 '12 at 14:20
  • 1
    nvm. `intrinsic_procedure_name` refers to something else. – Bort Feb 06 '12 at 14:26

6 Answers

10

C11 _Generic

Not a direct solution, but it does allow you to achieve the desired result if you are patient enough to list every type you care about, as in:

#include <assert.h>
#include <string.h>

#define typename(x) _Generic((x), \
    int:     "int", \
    float:   "float", \
    default: "other")

int main(void) {
    int i;
    float f;
    void* v;
    assert(strcmp(typename(i), "int")   == 0);
    assert(strcmp(typename(f), "float") == 0);
    assert(strcmp(typename(v), "other") == 0);
}

Compile and run with:

gcc -std=c11 a.c
./a.out

A good starting point with tons of types can be found in this answer.

Tested in Ubuntu 17.10, GCC 7.2.0. GCC only added support in 4.9.
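Applied to the question, the same technique can produce an enum tag instead of a string. The following is a sketch only: the VAR_TYPE macro name, the UNKNOWN member, and the exact list of enum members are assumptions, and the macro relies on each fixed-width typedef naming a distinct underlying type (true on common platforms, but not guaranteed).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative type tags; UNKNOWN is an added catch-all. */
enum var_type { INT8, UINT8, INT16, UINT16, INT32, UINT32,
                FLOAT, DOUBLE, UNKNOWN };

/* Maps the static type of x to a tag at compile time. If two of these
   typedefs named the same type, the compiler would reject the macro. */
#define VAR_TYPE(x) _Generic((x), \
    int8_t:   INT8,   uint8_t:  UINT8,  \
    int16_t:  INT16,  uint16_t: UINT16, \
    int32_t:  INT32,  uint32_t: UINT32, \
    float:    FLOAT,  double:   DOUBLE, \
    default:  UNKNOWN)

/* The selection is resolved by the compiler, so GCC and Clang accept it
   in a static assertion. */
static_assert(VAR_TYPE((double)0) == DOUBLE, "double maps to DOUBLE");

/* A rewriting script could then emit, for reduction(+:x):
       my_reduction(PLUS, &x, VAR_TYPE(x));
   without ever needing to know the declared type of x itself. */
```

The appeal of this design is that the script stays purely textual: the compiler fills in the tag when it expands VAR_TYPE(x).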

Ciro Santilli OurBigBook.com
5

You can use the sizeof operator to narrow down the type. Let the variable of unknown type be var; then:

if (sizeof(var) == sizeof(char))
    printf("char");
else if (sizeof(var) == sizeof(int))
    printf("int");
else if (sizeof(var) == sizeof(double))
    printf("double");

However, this breaks down whenever two or more primitive types have the same size, as int and float do on many platforms.
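The same-size problem is easy to demonstrate. The sketch below assumes a 32-bit IEEE 754 float (checked at compile time for the size, assumed for the encoding); the two function names are made up for illustration. Both functions receive identical bytes of identical size, yet "addition" means something different in each.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float occupies 32 bits, as on most current platforms. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* Adds the two 4-byte operands as unsigned integers. */
uint32_t add_as_uint32(const void *a, const void *b)
{
    uint32_t x, y;
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return x + y;
}

/* Adds the same two 4-byte operands as floats. */
float add_as_float(const void *a, const void *b)
{
    float x, y;
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return x + y;
}
```

Given float a = 1.0f and b = 2.0f, add_as_float(&a, &b) yields 3.0f, while add_as_uint32(&a, &b) adds the raw encodings and yields an unrelated bit pattern, so size alone cannot select the right operation.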

parth_07
3

GCC provides the typeof extension. It is not standard, but common enough (several other compilers, e.g. clang/llvm, have it).

You could perhaps consider customizing GCC by extending it with MELT (a domain specific language to extend GCC) to fit your purposes.
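For readers unfamiliar with the extension, here is a small sketch of what typeof can do. It uses the __typeof__ spelling accepted by GCC and Clang (C23 later standardized a typeof operator); the SWAP macro is an illustrative example, not something from the question.

```c
#include <assert.h>

/* __typeof__(expr) names a type, so it can appear wherever a type name
   can: in declarations, casts, and sizeof. */
static_assert(sizeof(__typeof__(1.0f)) == sizeof(float),
              "typeof of a float expression is float");

/* Type-generic swap: the temporary gets whatever type a has. */
#define SWAP(a, b)                 \
    do {                           \
        __typeof__(a) tmp_ = (a);  \
        (a) = (b);                 \
        (b) = tmp_;                \
    } while (0)
```

Note that the result is still only a type name for the compiler to use. It is not a value, so it cannot be passed to a function or switched on at run time.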

Basile Starynkevitch
  • 1
    Thanks, Basile. As you can see from my posting, I am aware of typeof, but typeof will not give me what I need: it does not return anything; you cannot pass it or its information to a function. You can do this: #define add(x,y,z) { \ typeof(x) _x = x; \ typeof(y) _y = y; \ z = _x + _y; \ } The typeof operator seems to me to have very limited usefulness. – Amittai Aviram Feb 06 '12 at 06:21
  • 1
    It isn't standard and won't work on most C compilers. In my experience it is a very bad idea to use GCC extensions for production code. – Lundin Feb 06 '12 at 07:39
3

C doesn't really have a way to perform this at pre-compile time, unless you write a flood of macros. I would not recommend that approach, which would basically go like this:

void int_reduction (enum reduction_op op, void * var, size_t size);

#define reduction(type,op,var,size) type##_reduction(op, var, size)

...
reduction(int, PLUS, &x, sizeof(x)); // function call

Note that this is very bad practice and should only be used as a last resort when maintaining poorly written legacy code, if even then. This approach offers no type safety or similar guarantees.
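A fuller sketch of the token-pasting approach might look like the following. What each *_reduction function actually does is not specified in this answer, so the bodies here are an assumption: they reset the variable to the identity element of the operator, which is one type-dependent step a reduction needs.

```c
#include <assert.h>
#include <stddef.h>

enum reduction_op { PLUS, TIMES };

/* Assumed behaviour: set *var to the identity of op (0 for +, 1 for *). */
void int_reduction(enum reduction_op op, void *var, size_t size)
{
    assert(size == sizeof(int));
    *(int *)var = (op == TIMES) ? 1 : 0;
}

void double_reduction(enum reduction_op op, void *var, size_t size)
{
    assert(size == sizeof(double));
    *(double *)var = (op == TIMES) ? 1.0 : 0.0;
}

/* The preprocessor pastes the type name onto "_reduction". */
#define reduction(type, op, var, size) type##_reduction(op, var, size)
```

The weakness is visible here: nothing ties the type argument to the variable. Writing reduction(int, PLUS, &d, sizeof d) for a double d compiles without complaint, and only the size assertion has a chance of catching it.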

A safer approach is to explicitly call int_reduction() from the caller, or to call a generic function which decides the type at run time:

void reduction (enum type type, enum reduction_op op, void * var, size_t size)
{
  switch(type)
  {
    case INT_TYPE:
      int_reduction(op, var, size);
      break;
    ...
  }
} 

If int_reduction is inlined and various other optimizations are done, this runtime evaluation isn't necessarily that much slower than the obfuscated macros, but it is far safer.

Lundin
  • 1
    Yes, thanks for your thoughts. But how do you find out the type? That's my big challenge. The rest is just syntax--and of course I agree with your comments about the best syntax. In the above case, for instance, how do I know that a given variable is INT_TYPE? The script I am writing to move "reduction(+:x)" up to a separate call "reduction(INT_TYPE, PLUS, &var, sizeof(x));" would have to know that x has type INT_TYPE, but how?--short of my writing an entire parser just to get the type of x? – Amittai Aviram Feb 06 '12 at 12:59
  • 1
    @AmittaiAviram How do you know when to declare a variable as `int`? I have no idea of the nature of your data or where it is coming from, so I can't answer that question. If it is some sort of external raw data, then you will naturally have to determine the data type before doing any calculation. And that is an algorithm issue which has nothing to do with C programming syntax as such. – Lundin Feb 06 '12 at 13:51
  • 1
    @Lundin--No, no. The picture is more like this. Suppose you have a `main` function that has the declaration `int x` near the top. Say 100 lines down from that, you have `#pragma omp parallel for reduction (+:x)`. The `x` there has type `int` because that's how it was declared in the current scope. The programmer knows this and could annotate a function call himself, but I want to be able to do this automatically. – Amittai Aviram Feb 07 '12 at 00:08
  • The source code already exists, with the variable declarations and the OpenMP pragmas. What I want to do is automatically refactor it to replace the `reduction` clause with a function call of the form `reduction(enum reduction_op, void * var, enum type, size_t size)` (or something similar). My problem is that the form of the `reduction` clause does not, itself, have any representation of the type of the variable, since--of course--the compiler is already presumed to know the type when it gets to that clause. – Amittai Aviram Feb 07 '12 at 00:11
  • @AmittaiAviram You will either have to declare the original data type as a struct, with the enum type as first member, then pass a pointer to this struct to a function. Or keep the original variable but include the type in the call to the function/macro. There are no other alternatives in standard C. – Lundin Feb 07 '12 at 07:29
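The struct-based alternative from the last comment could be sketched as follows. All names here (tagged_var, tagged_reduction, the union layout) are hypothetical, chosen only to illustrate carrying the type tag alongside the value.

```c
#include <assert.h>
#include <stddef.h>

enum type { INT_TYPE, DOUBLE_TYPE };
enum reduction_op { PLUS, TIMES };

/* The tag travels with the data, so a callee can recover the type. */
struct tagged_var {
    enum type type;                     /* first member, as suggested */
    union { int i; double d; } value;
};

/* Folds *partial into *acc using the arithmetic selected by the tag. */
void tagged_reduction(enum reduction_op op, struct tagged_var *acc,
                      const struct tagged_var *partial)
{
    assert(acc != NULL && partial != NULL);
    assert(acc->type == partial->type);
    switch (acc->type) {
    case INT_TYPE:
        if (op == PLUS) acc->value.i += partial->value.i;
        else            acc->value.i *= partial->value.i;
        break;
    case DOUBLE_TYPE:
        if (op == PLUS) acc->value.d += partial->value.d;
        else            acc->value.d *= partial->value.d;
        break;
    }
}
```

This keeps the call sites type-agnostic, at the cost of changing every declaration of a reduction variable, which may not be acceptable for legacy code that is only being rewritten by a script.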
2

You could also consider customizing GCC with a plugin or a MELT extension for your needs. However, this requires understanding some of GCC internal representations (Gimple, Tree) which are complex (so will take you days of work at least).

But types are a compile-time-only concept in C: they are not reified, so no type information survives into the running program.

Basile Starynkevitch
  • 1
    This is more or less what I wound up doing--not a plug-in, but some code hacking within GCC itself. I figured out how to get the type information from the Gimple tree_nodes at compile time. It did take quite a bit of digging. Thanks! – Amittai Aviram Apr 24 '12 at 04:14
0

In general, it is not possible to identify what kind of data is in a given byte or sequence of bytes. For example, a 0 byte could be an empty string or the integer 0, and the bit pattern for 99 could be that number or the letter 'c'.

The following is a bit of hackery to turn an arbitrary sequence of bytes into a printable value. It works in most cases (but not for numbers that could also be characters). It is for the lcc compiler under Windows 7, with 32-bit ints, longs and 64-bit doubles.

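#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Guesses a printable form for the bytes at x. The result may be x itself,
   the literal "0", or a calloc'd buffer, so the caller cannot tell whether
   to free it. Reading *dx assumes at least sizeof(double) readable bytes. */
char* OclAnyToString(void* x)
{ char* ss = (char*) x;
  int ind = 0;

  int* ix = (int*) x;
  long* lx = (long*) x;
  double* dx = (double*) x;

  char* sbufi = (char*) calloc(21, sizeof(char));
  char* sbufl = (char*) calloc(21, sizeof(char));
  char* sbufd = (char*) calloc(21, sizeof(char));

  if (ss[0] == '\0')
  { /* snprintf keeps a large double from overflowing the 21-byte buffer */
    snprintf(sbufi, 21, "%d", *ix);
    snprintf(sbufd, 21, "%f", *dx);
    free(sbufl);
    if (strcmp(sbufi,"0") == 0 &&
        strcmp(sbufd,"0.000000") == 0)
    { free(sbufi);
      free(sbufd);
      return "0"; }
    else if (strcmp(sbufd,"0.000000") != 0)
    { free(sbufi);
      return sbufd; }
    else
    { free(sbufd);
      return sbufi; }
  }

  /* Scan for a printable ASCII string; the cast avoids undefined
     behaviour when isprint is given a negative char value. */
  while (ind < 1024 && 0 < ss[ind] && ss[ind] < 128 &&
         isprint((unsigned char) ss[ind]))
  { ind++;
  }

  if (ss[ind] == '\0')
  { free(sbufi);
    free(sbufl);
    free(sbufd);
    return (char*) x; }

  snprintf(sbufi, 21, "%d", *ix);
  snprintf(sbufl, 21, "%ld", *lx);
  snprintf(sbufd, 21, "%f", *dx);

  if (strcmp(sbufd,"0.000000") != 0)
  { free(sbufi);
    free(sbufl);
    return sbufd;
  }

  if (strcmp(sbufi,sbufl) == 0)
  { free(sbufd);
    free(sbufl);
    return sbufi;
  }
  else
  { free(sbufd);
    free(sbufi);
    return sbufl;
  }
}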
Kevin Lano