Unfamiliar C Syntax in the Declaration of Function

Question

I'm currently looking at some C code that doesn't make any sense to me. What is (elementSize)? How am I supposed to pass arguments to this static function? What is the name of this syntax style so I can learn more about it?

static int torch_Tensor_(elementSize)(lua_State *L)
{
  luaT_pushinteger(L, THStorage_(elementSize)());
  return 1;
}

https://github.com/torch/torch7/blob/master/generic/Tensor.c

This is the file I am trying to understand for reference.

`torch_Tensor_` and `THStorage_` are probably macros. THStorage_ may be a function which returns a function pointer. — nemequ, Nov 17 '18 at 23:11
Taken on its own, it is just a syntax error. You can define a function that returns a pointer to a function (e.g. `signal()` in standard C), but not a function that returns a function (nor a function that returns an array, rather than a pointer). If there are some macros to help (such as `#define torch_Tensor_(x) …`), then it might compile, and you need to work out where `elementSize` is defined to know what is passed to the macro as an argument. It looks like you'll need to study the headers provided with Lua. — Jonathan Leffler, Nov 17 '18 at 23:14

melpomene · Answer 1 · 2018-11-17T23:45:31.527

Normally

static int torch_Tensor_(elementSize)(lua_State *L)

would mean torch_Tensor_ is a function that takes a single parameter called elementSize that has no type (?! - syntax error) and returns a function that takes a pointer to lua_State and returns an int. This is blatantly invalid (functions cannot return other functions).

But what's actually going on here is that torch_Tensor_ is defined as a function-like macro, so before the compiler even sees this declaration, torch_Tensor_(elementSize) is replaced by something else.

In https://github.com/torch/torch7/blob/master/Tensor.c there is

#include "general.h"

#define torch_Storage_(NAME) TH_CONCAT_4(torch_,Real,Storage_,NAME)
#define torch_Storage TH_CONCAT_STRING_3(torch.,Real,Storage)
#define torch_Tensor_(NAME) TH_CONCAT_4(torch_,Real,Tensor_,NAME)
#define torch_Tensor TH_CONCAT_STRING_3(torch.,Real,Tensor)

#include "generic/Tensor.c"
#include "THGenerateAllTypes.h"

#include "generic/Tensor.c"
#include "THGenerateHalfType.h"

with TH_CONCAT_... defined in lib/TH/THGeneral.h.in:

#define TH_CONCAT_STRING_3(x,y,z) TH_CONCAT_STRING_3_EXPAND(x,y,z)
#define TH_CONCAT_STRING_3_EXPAND(x,y,z) #x #y #z

#define TH_CONCAT_4_EXPAND(x,y,z,w) x ## y ## z ## w
#define TH_CONCAT_4(x,y,z,w) TH_CONCAT_4_EXPAND(x,y,z,w)

So torch_Tensor_ is defined as a macro before generic/Tensor.c is included.

torch_Tensor_(elementSize)

expands to

TH_CONCAT_4(torch_,Real,Tensor_,elementSize)

which expands to

TH_CONCAT_4_EXPAND(torch_,...,Tensor_,elementSize)

... is a placeholder, not real code. Real is defined as a macro in the various THGenerate*Type.h files (as a capitalized type name such as Char, Int, or Float), so this line actually becomes

TH_CONCAT_4_EXPAND(torch_,Char,Tensor_,elementSize)
TH_CONCAT_4_EXPAND(torch_,Int,Tensor_,elementSize)
TH_CONCAT_4_EXPAND(torch_,Float,Tensor_,elementSize)
...

depending on context. Anyway, the end result is a single identifier of the form

torch_CharTensor_elementSize
torch_IntTensor_elementSize
torch_FloatTensor_elementSize
...

(one token).

The resulting function definition thus looks like e.g.

static int torch_CharTensor_elementSize(lua_State *L)
{
    ...
}

depending on which context generic/Tensor.c was included in.

The reason things are done this way is to have what amounts to the same code, but for multiple different types. In C++ you would write a function template:

namespace torch {
    template<typename Real>
    static int Tensor_elementSize(lua_State *L) { ... }
}

But C has no templates (nor namespaces), so the only way to get "generic" code like this is to do it manually with macros and preprocessing tricks (and manually "decorating" names; e.g. the elementSize function for floats is really called torch_FloatTensor_elementSize).

All we're really trying to do is abstract over a type parameter, here called Real.

I'm guessing that file is compiled multiple times, with `elementSize` defined to different values. For example, if you defined it to 8, you would end up with `torch_RealTensor_8()`. — nemequ, Nov 17 '18 at 23:44
@nemequ No, the parameter is `Real`. `elementSize` is the name of the function. — melpomene, Nov 17 '18 at 23:45
Thank you so much for the guidance! This helps a ton and I'll be going over it carefully to make sure I understand everything. — btomtom5, Nov 19 '18 at 00:41
Wow, I finally sat down and digested the guide you wrote. It finally makes a ton of sense now. The only thing I am still left confused about is how the compiler knows that Real, rather than Tensor_ or elementSize, is the macro-defined name in the following piece of code: TH_CONCAT_4(torch_,Real,Tensor_,elementSize)? — btomtom5, Nov 23 '18 at 22:17
@btomtom5 The preprocessor simply performs macro expansion on all identifiers. In principle all of those names could be `#define`d as something else, but a quick search of the codebase shows that only `#define Real ...` appears in some files, so that's what the preprocessor replaces. Of course I'm assuming that `generic/Tensor.c` is only included in places where `#define Real ...` is active; otherwise it would remain as is. — melpomene, Nov 23 '18 at 22:26
