
How can I make a C macro that takes a list of words separated by a space and split them up?

I want to have a macro such as DECLARE() below.

#define EXPAND(xy) /* expand sequence of words separated by space into 2 */

#define DECLARE(xy) DECLARE_2(EXPAND(xy)) /* x and y are separated by a space, each should go to each argument of DECLARE_2 */

#define DECLARE_2(const, type) char *type##_str = #type; const type

So that I can do:

typedef struct MyStruct { int value; } MyStruct;
DECLARE(const MyStruct) x = { 2 };
print(MyStruct_str); // prints 'MyStruct'
aganm

3 Answers


How can I make a C macro that takes a list of words separated by a space and split them up?

You cannot, at least not in any general way with only the preprocessing facilities defined by standard C. Individual arguments to function-like preprocessor macros can be sequences of multiple (preprocessing) tokens, as you describe, but macros can do only these things with their arguments:

  • insert them, in full, into other token sequences (including indirectly by passing them as arguments to other macros)
  • convert them to strings
  • concatenate the first and/or last with other tokens

There are some interesting things you can do with the last of those in combination with the automatic rescanning of macro expansions, but they do not get you where you want to go in any general way.
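As a rough illustration of those three operations, here is a minimal sketch. The macro names INSERT, STRINGIFY and PASTE_LAST are made up for this example and are not part of the question.

```c
/* 1. Insert the argument, in full, into another token sequence. */
#define INSERT(x) x

/* 2. Convert the whole argument to a string literal. */
#define STRINGIFY(x) #x

/* 3. Paste another token onto the last token of the argument. */
#define PASTE_LAST(x) x##_suffix

/* Both words pass through untouched: `static const int inserted = 5;` */
static INSERT(const int) inserted = 5;

/* Both words end up in one string: "const MyStruct" */
static const char *stringified = STRINGIFY(const MyStruct);

/* Only the last token `my` is pasted: `static const int my_suffix = 7;` */
static PASTE_LAST(const int my) = 7;
```

In each case the argument is handled as one unit. None of the three operations lets the macro pick out `const` and `MyStruct` separately.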

John Bollinger

Using macros you can make it work in one very specific circumstance: when you have a known list of expected values for y (the last token).

In this example our list of expected last tokens will be hello, lovely and world, but you can trivially change and expand them for your purposes.

#define hello_SEPARATE ,hello
#define lovely_SEPARATE ,lovely
#define world_SEPARATE ,world

#define SEPARATE(xy) xy##_SEPARATE

These are some example outputs:

SEPARATE(hello world)        // hello ,world
SEPARATE(world hello)        // world ,hello
SEPARATE(hello lovely)       // hello ,lovely
SEPARATE(goodbye lovely)     // goodbye ,lovely
SEPARATE(I love you world)   // I love you ,world

SEPARATE(hello)              // ,hello
SEPARATE(goodbye)            // goodbye_SEPARATE
SEPARATE(lovely goodbye)     // lovely goodbye_SEPARATE

As shown by the last example, it will not work as intended if the last token is unexpected.
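To tie this back to the question, the same trick might be wired into DECLARE roughly as follows. This is only a sketch under a few assumptions: the helper DECLARE_I and the parameter name qual are introduced here for illustration, MyStruct must be registered by hand as an expected last token, and the compiler is assumed to handle __VA_ARGS__ the way GCC and Clang do.

```c
typedef struct MyStruct { int value; } MyStruct;

/* One line per type name that may appear as the last token. */
#define MyStruct_SEPARATE ,MyStruct

#define SEPARATE(xy) xy##_SEPARATE

/* DECLARE_2 as in the question, with the first parameter renamed to qual. */
#define DECLARE_2(qual, type) const char *type##_str = #type; qual type

/* Extra level of indirection: SEPARATE is expanded first, so its comma
   is already there when DECLARE_2 collects its two arguments. */
#define DECLARE_I(...) DECLARE_2(__VA_ARGS__)
#define DECLARE(xy) DECLARE_I(SEPARATE(xy))

/* Expands to:
   const char *MyStruct_str = "MyStruct"; const MyStruct x = { 2 }; */
DECLARE(const MyStruct) x = { 2 };
```

The limitation described above still applies: a type that has no matching _SEPARATE line will not be split.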


We can further improve the solution above by handling the erratic behaviour it shows on failure.

You may have noticed in the last examples the stray comma at the start (even when there's only one keyword present) and the "_SEPARATE" left at the end. This is a consequence of how crudely the macro is defined.

To fix it, we can make use of the fact that:

  1. The C preprocessor will only expand a function-like macro if its name is followed by an opening parenthesis.
  2. We can detect commas (,) in variadic macro arguments by shifting the parameters in a variadic macro (similar to this answer on counting __VA_ARGS__, and explained in this blog post).
  3. We can make pseudo-if statements, using fact 2 and token concatenation (##).
  4. We can detect if a variadic macro has no arguments, using 1, 2 and 3 (as explained in this blog post by Jens Gustedt).
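Facts 1 to 3 can be seen in isolation with a scaled-down version of the machinery. The names below (ARG_4, HAS_COMMA_SMALL, TRIGGER, BRANCH_0, BRANCH_1, DESCRIBE) are hypothetical and chosen only for this sketch, which handles at most three arguments.

```c
/* Fact 2: the caller's arguments shift the trailing 1, 1, 0 along, so the
   fourth position holds 0 for one argument and 1 for two or three.
   `unused` only keeps the variadic part non-empty. */
#define ARG_4(a, b, c, d, ...) d
#define HAS_COMMA_SMALL(...) ARG_4(__VA_ARGS__, 1, 1, 0, unused)

/* Fact 1: TRIGGER only turns into a comma when parentheses follow it. */
#define TRIGGER(...) ,

/* Fact 3: paste the 0/1 result onto a prefix to select a branch. */
#define GLUE_(a, b) a##b
#define GLUE(a, b) GLUE_(a, b)
#define BRANCH_0(...) "no comma"
#define BRANCH_1(...) "has comma"
#define DESCRIBE(...) GLUE(BRANCH_, HAS_COMMA_SMALL(__VA_ARGS__))(__VA_ARGS__)

static const int plain     = HAS_COMMA_SMALL(TRIGGER);    /* 0 */
static const int triggered = HAS_COMMA_SMALL(TRIGGER ()); /* 1 */
```

GLUE needs the extra GLUE_ level so that HAS_COMMA_SMALL is expanded to 0 or 1 before the paste happens.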

Here's the final solution:

// Boilerplate
#define EXPAND(x) x

#define _GLUE(X,Y) X##Y
#define GLUE(X,Y) _GLUE(X,Y)

#define _ARG_100(_,\
   _100,_99,_98,_97,_96,_95,_94,_93,_92,_91,_90,_89,_88,_87,_86,_85,_84,_83,_82,_81, \
   _80,_79,_78,_77,_76,_75,_74,_73,_72,_71,_70,_69,_68,_67,_66,_65,_64,_63,_62,_61, \
   _60,_59,_58,_57,_56,_55,_54,_53,_52,_51,_50,_49,_48,_47,_46,_45,_44,_43,_42,_41, \
   _40,_39,_38,_37,_36,_35,_34,_33,_32,_31,_30,_29,_28,_27,_26,_25,_24,_23,_22,_21, \
   _20,_19,_18,_17,_16,_15,_14,_13,_12,_11,_10,_9,_8,_7,_6,_5,_4,_3,_2,X_,...) X_
#define HAS_COMMA(...) EXPAND(_ARG_100(__VA_ARGS__, \
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ,1, \
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0))

#define _TRIGGER_PARENTHESIS_(...) ,
#define _PASTE5(_0, _1, _2, _3, _4) _0 ## _1 ## _2 ## _3 ## _4
#define _IS_EMPTY_CASE_0001 ,
#define _IS_EMPTY(_0, _1, _2, _3) HAS_COMMA(_PASTE5(_IS_EMPTY_CASE_, _0, _1, _2, _3))
#define IS_EMPTY(...)  \
   _IS_EMPTY(                                                               \
      HAS_COMMA(__VA_ARGS__),                                       \
      HAS_COMMA(_TRIGGER_PARENTHESIS_ __VA_ARGS__),                 \
      HAS_COMMA(__VA_ARGS__ (/*empty*/)),                           \
      HAS_COMMA(_TRIGGER_PARENTHESIS_ __VA_ARGS__ (/*empty*/))      \
   )

// Place where you define your expected last tokens
#define hello_REMOVE
#define lovely_REMOVE
#define world_REMOVE

#define hello_SEPARATE ,hello
#define lovely_SEPARATE ,lovely
#define world_SEPARATE ,world

// SEPARATE Macro
#define _IS_EMPTY_TRIGGERED_0(xy) xy##_SEPARATE
#define _IS_EMPTY_TRIGGERED_1(xy) xy

#define _KEYWORD_TRIGGERED_0(xy) xy
#define _KEYWORD_TRIGGERED_1(xy) GLUE(_IS_EMPTY_TRIGGERED_, IS_EMPTY(xy##_REMOVE))(xy)

#define SEPARATE(xy) GLUE(_KEYWORD_TRIGGERED_, HAS_COMMA(xy##_SEPARATE()))(xy)

And the list of examples:

SEPARATE(hello world)        // hello ,world
SEPARATE(world hello)        // world ,hello
SEPARATE(hello lovely)       // hello ,lovely
SEPARATE(goodbye lovely)     // goodbye ,lovely
SEPARATE(I love you world)   // I love you ,world

SEPARATE(hello)              // hello
SEPARATE(goodbye)            // goodbye
SEPARATE(lovely goodbye)     // lovely goodbye

As you can see, the behavior of the macro in case of failure is much better: it simply returns what you put in.

This behavior is also customizable. Changing the definition of "_IS_EMPTY_TRIGGERED_1" (the case where the input is a single keyword) and/or "_KEYWORD_TRIGGERED_0" (the case where no keyword is detected) changes what happens on failure. For example, if you want the output to be an error marker, you can replace those lines with:

//...
#define _IS_EMPTY_TRIGGERED_1(xy) ERROR_1

#define _KEYWORD_TRIGGERED_0(xy) ERROR_2
//...

And this will be the result:

SEPARATE(hello world)        // hello ,world
SEPARATE(world hello)        // world ,hello
SEPARATE(hello lovely)       // hello ,lovely
SEPARATE(goodbye lovely)     // goodbye ,lovely
SEPARATE(I love you world)   // I love you ,world

SEPARATE(hello)              // ERROR_1
SEPARATE(goodbye)            // ERROR_2
SEPARATE(lovely goodbye)     // ERROR_2

Note: The EXPAND in the definition of HAS_COMMA in the final version is an extra expansion step. It compensates for the fact that MSVC does not expand __VA_ARGS__ the way most other compilers do.

Luiz Martins

Macros in C are replaced by the preprocessor, so if you write a function that does the splitting, you can use a macro as syntactic sugar for calling it, but a macro by itself provides no functionality. You may refer to this answer for the spacing issue: https://stackoverflow.com/a/50000111/14273548
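One possible reading of this suggestion is sketched below: the macro stringifies its argument and a run-time function does the splitting. The function split_last_word and the macro SPLIT_WORDS are hypothetical names invented for this example. Note that this yields strings at run time, not separate tokens at compile time, so it cannot produce a declaration the way DECLARE is meant to.

```c
#include <stddef.h>
#include <string.h>

/* Copies the text before the last space into head and the text after it
   into tail. Returns 1 on success, 0 if there is no space or a buffer is
   too small. */
static int split_last_word(const char *text,
                           char *head, size_t head_size,
                           char *tail, size_t tail_size)
{
    const char *space = strrchr(text, ' ');
    if (space == NULL)
        return 0;

    size_t head_len = (size_t)(space - text);
    size_t tail_len = strlen(space + 1);
    if (head_len >= head_size || tail_len >= tail_size)
        return 0;

    memcpy(head, text, head_len);
    head[head_len] = '\0';
    memcpy(tail, space + 1, tail_len + 1); /* includes the terminator */
    return 1;
}

/* Sugar: stringify the words, then split them at run time.
   head and tail must be arrays, because sizeof is applied to them. */
#define SPLIT_WORDS(words, head, tail) \
    split_last_word(#words, head, sizeof head, tail, sizeof tail)
```

With two char arrays `head` and `tail`, a call such as `SPLIT_WORDS(const MyStruct, head, tail)` would leave "const" in `head` and "MyStruct" in `tail`.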

Uriya Harpeness