
Referring to js0n.c

The code syntax is as below:

    static void *gostruct[] =
    {
        [0 ... 255] = &&l_bad,
        ['\t'] = &&l_loop, [' '] = &&l_loop, ['\r'] = &&l_loop, ['\n'] = &&l_loop,
        ['"'] = &&l_qup,
        [':'] = &&l_loop, [','] = &&l_loop,
        ['['] = &&l_up, [']'] = &&l_down, // tracking [] and {} individually would allow fuller validation but is really messy
        ['{'] = &&l_up, ['}'] = &&l_down,
        ['-'] = &&l_bare, [48 ... 57] = &&l_bare, // 0-9
        [65 ... 90] = &&l_bare, // A-Z
        [97 ... 122] = &&l_bare // a-z
    };

........
.......

l_bad:
    *vlen = cur - json; // where error'd
    return 0;

........
........

Can anyone explain what is being done here? What do the `[0 ... 255]` and `&&l_bad` syntax do here?

dandan78
Gaurav K

1 Answer


`...` inside a designator is an extension provided by GCC:

https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html#Designated-Inits

To initialize a range of elements to the same value, write [first ... last] = value. This is a GNU extension. For example,

 int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3 };
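As a rough sketch of how this is used in the question's table: a blanket range sets a default for every index, and later designators override individual entries or sub-ranges (the last initializer for an index wins). The names `char_class`, `char_class_table` and `classify` below are made up for illustration and are not taken from js0n.c; the code assumes GCC or Clang with GNU extensions enabled.

```c
/* Hypothetical character classes; CC_SPACE is deliberately 0 so that the
   blanket [0 ... 255] range below is visibly doing something. */
enum char_class { CC_SPACE, CC_DIGIT, CC_ALPHA, CC_BAD };

/* Lookup table indexed by byte value. The first range marks everything
   as CC_BAD; the later designators override the entries we care about. */
static const unsigned char char_class_table[256] = {
    [0 ... 255] = CC_BAD,
    ['\t'] = CC_SPACE, [' '] = CC_SPACE, ['\r'] = CC_SPACE, ['\n'] = CC_SPACE,
    ['0' ... '9'] = CC_DIGIT,   /* same as [48 ... 57] in ASCII */
    ['A' ... 'Z'] = CC_ALPHA,   /* same as [65 ... 90] in ASCII */
    ['a' ... 'z'] = CC_ALPHA,   /* same as [97 ... 122] in ASCII */
};

/* Classify a single byte with one table lookup. */
enum char_class classify(unsigned char c)
{
    return (enum char_class)char_class_table[c];
}
```

For example, `classify('7')` yields `CC_DIGIT`, while `classify('{')` falls through to the default `CC_BAD`. With `-Wextra`, GCC may warn about the overridden entries (`-Woverride-init`), which is expected for this pattern.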

&& is another extension

https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html#Labels-as-Values

You can get the address of a label defined in the current function (or a containing function) with the unary operator &&. The value has type void *. This value is a constant and can be used wherever a constant of that type is valid. For example:

 void *ptr;
 /* ... */
 ptr = &&foo;
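Combining the two extensions gives a jump table: an array of label addresses indexed by character, used with GCC's computed `goto *expr;`. The following is only a minimal sketch of that idea, not the actual js0n.c logic; the function name `brackets_balanced` and the labels are invented for illustration, and like the question's table it treats `[]` and `{}` interchangeably. It needs GCC or Clang.

```c
/* Returns 1 if every opening [ or { in s has a closing ] or },
   0 otherwise. Bracket kinds are not distinguished, so "[}" counts
   as balanced, mirroring the simplification in the question's table. */
int brackets_balanced(const char *s)
{
    /* One label address per possible byte value. Labels have function
       scope, so they may be referenced before they are defined. */
    static void *dispatch[256] = {
        [0 ... 255] = &&l_skip,          /* default: ignore the byte */
        [0] = &&l_end,                   /* NUL terminator ends the scan */
        ['['] = &&l_up,   ['{'] = &&l_up,
        [']'] = &&l_down, ['}'] = &&l_down,
    };
    const unsigned char *cur = (const unsigned char *)s;
    int depth = 0;

    goto *dispatch[*cur];                /* computed goto on the first byte */

l_skip:
    cur++;
    goto *dispatch[*cur];
l_up:
    depth++;
    cur++;
    goto *dispatch[*cur];
l_down:
    if (--depth < 0)
        return 0;                        /* closing bracket with nothing open */
    cur++;
    goto *dispatch[*cur];
l_end:
    return depth == 0;
}
```

Here `brackets_balanced("{[]}")` returns 1 and `brackets_balanced("]")` returns 0. Each byte costs one table lookup and one indirect jump, which is why parsers sometimes use this pattern in place of a `switch`.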
user657267
  • putting it all together that code is making a jump table that uses ascii values for indices, presumably for a parser. – ratchet freak May 22 '15 at 09:29
  • Specifically a JSON parser, so far as I can tell. – Kevin May 22 '15 at 14:08
  • So, since labels and variables have separate namespaces, how does the parser distinguish `&&foo` meaning "address of the label `foo`" from `&&foo` meaning "address of the address of the variable `foo`"? (Taking the address of an address produces, I believe, undefined behavior, but it _is_ syntactically valid.) – dodgethesteamroller May 22 '15 at 15:26
  • The address-of operator (&) cannot be applied to an rvalue, which is what is returned by the address-of operator. – Kevin M May 22 '15 at 15:50
  • In GCC, you get an error along the lines of "label expected: i" since it assumes unary && is address-of-label. In MSVC, it rightly assumes unary && is a syntax error. If you do `&(&i)` in either, you get an error along the lines of "expected: l-value" – Kevin M May 22 '15 at 16:01
  • @KevinM that makes sense. When did it become a syntax error to apply the address-of operator (&) to an rvalue? I'm guessing in C99, perhaps? The last time I used Visual C++ regularly was around 1998, which would have been the pre-C99 ANSI standard, and the compiler allowed it then (I know because I remember a typo of a doubled-up `&` getting into production code!). – dodgethesteamroller May 22 '15 at 16:22
  • @dodgethesteamroller `&&` is an entirely separate token from `&`, so there's no way the standard C grammar could interpret `&&x` as "address of address of x" regardless of the value category of `&x`. – Tavian Barnes May 22 '15 at 16:48
  • @TavianBarnes You're confusing the standard with the implementation thereof. How do you know how the lexing and parsing stages worked in Microsoft's C compiler circa 1998? (I'll save you the trouble: You can't, unless you worked at MS then.) Your answer is analogous to saying that prefix `--` and unary `-` are two different tokens, so there's no need for the parser to disambiguate them--clearly a circular argument. – dodgethesteamroller May 22 '15 at 17:10
  • @dodgethesteamroller You asked whether this changed in C99. I just wanted to indicate that it did not; `&&x` has never parsed that way by any version of the standard. (Of course, MSVC can and does do whatever it wants.) – Tavian Barnes May 22 '15 at 17:14
  • @dodgethesteamroller: `--` is always parsed as `--`, and `&&` is always parsed as `&&`. C99 §6.4¶4: _the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token_ – ninjalj May 22 '15 at 18:40
  • @dodgethesteamroller -- MSVC likely was treating && as a unary "address-of-address-of-x" operator (weird, but unsurprising) – LThode May 22 '15 at 18:42
  • This code would be utterly broken using a C++11 compiler : "..." and "&&" are now valid and official operators, but in completely different context. – Errata May 27 '15 at 08:33