997

How does this C program work?

main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1:10);}

It compiles as it is (tested on gcc 4.6.3). It prints the time when compiled. On my system:

    !!  !!!!!!              !!  !!!!!!              !!  !!!!!! 
    !!  !!  !!              !!      !!              !!  !!  !! 
    !!  !!  !!              !!      !!              !!  !!  !! 
    !!  !!!!!!    !!        !!      !!    !!        !!  !!!!!! 
    !!      !!              !!      !!              !!  !!  !! 
    !!      !!              !!      !!              !!  !!  !! 
    !!  !!!!!!              !!      !!              !!  !!!!!!

Source: sykes2 - A clock in one line, sykes2 author hints

Some hints: No compile warnings per default. Compiled with -Wall, the following warnings are emitted:

sykes2.c:1:1: warning: return type defaults to ‘int’ [-Wreturn-type]
sykes2.c: In function ‘main’:
sykes2.c:1:14: warning: value computed is not used [-Wunused-value]
sykes2.c:1:1: warning: implicit declaration of function ‘putchar’ [-Wimplicit-function-declaration]
sykes2.c:1:1: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
sykes2.c:1:1: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
sykes2.c:1:1: warning: control reaches end of non-void function [-Wreturn-type]
S.S. Anne
  • 15,171
  • 8
  • 38
  • 76
corny
  • 7,824
  • 3
  • 14
  • 20

4 Answers4

1834

Let's de-obfuscate it.

Indenting:

main(_) {
    _^448 && main(-~_);
    putchar(--_%64
        ? 32 | -~7[__TIME__-_/8%8][">'txiZ^(~z?"-48] >> ";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1
        : 10);
}

Introducing variables to untangle this mess:

main(int i) {
    if(i^448)
        main(-~i);
    if(--i % 64) {
        char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
        char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    } else {
        putchar(10); // newline
    }
}

Note that -~i == i+1 because of twos-complement. Therefore, we have

main(int i) {
    if(i != 448)
        main(i+1);
    i--;
    if(i % 64 == 0) {
        putchar('\n');
    } else {
        char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
        char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    }
}

Now, note that a[b] is the same as b[a], and apply the -~ == 1+ change again:

main(int i) {
    if(i != 448)
        main(i+1);
    i--;
    if(i % 64 == 0) {
        putchar('\n');
    } else {
        char a = (">'txiZ^(~z?"-48)[(__TIME__-i/8%8)[7]] + 1;
        char b = a >> ";;;====~$::199"[(i*2&8)|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    }
}

Converting the recursion to a loop and sneaking in a bit more simplification:

// please don't pass any command-line arguments
main() {
    int i;
    for(i=447; i>=0; i--) {
        if(i % 64 == 0) {
            putchar('\n');
        } else {
            char t = __TIME__[7 - i/8%8];
            char a = ">'txiZ^(~z?"[t - 48] + 1;
            int shift = ";;;====~$::199"[(i*2&8) | (i/64)];
            if((i & 2) == 0)
                shift /= 8;
            shift = shift % 8;
            char b = a >> shift;
            putchar(32 | (b & 1));
        }
    }
}

This outputs one character per iteration. Every 64th character, it outputs a newline. Otherwise, it uses a pair of data tables to figure out what to output, and puts either character 32 (a space) or character 33 (a !). The first table (">'txiZ^(~z?") is a set of 10 bitmaps describing the appearance of each character, and the second table (";;;====~$::199") selects the appropriate bit to display from the bitmap.

The second table

Let's start by examining the second table, int shift = ";;;====~$::199"[(i*2&8) | (i/64)];. i/64 is the line number (6 to 0) and i*2&8 is 8 iff i is 4, 5, 6 or 7 mod 8.

if((i & 2) == 0) shift /= 8; shift = shift % 8 selects either the high octal digit (for i%8 = 0,1,4,5) or the low octal digit (for i%8 = 2,3,6,7) of the table value. The shift table ends up looking like this:

row col val
6   6-7 0
6   4-5 0
6   2-3 5
6   0-1 7
5   6-7 1
5   4-5 7
5   2-3 5
5   0-1 7
4   6-7 1
4   4-5 7
4   2-3 5
4   0-1 7
3   6-7 1
3   4-5 6
3   2-3 5
3   0-1 7
2   6-7 2
2   4-5 7
2   2-3 3
2   0-1 7
1   6-7 2
1   4-5 7
1   2-3 3
1   0-1 7
0   6-7 4
0   4-5 4
0   2-3 3
0   0-1 7

or in tabular form

00005577
11775577
11775577
11665577
22773377
22773377
44443377

Note that the author used the null terminator for the first two table entries (sneaky!).

This is designed after a seven-segment display, with 7s as blanks. So, the entries in the first table must define the segments that get lit up.

The first table

__TIME__ is a special macro defined by the preprocessor. It expands to a string constant containing the time at which the preprocessor was run, in the form "HH:MM:SS". Observe that it contains exactly 8 characters. Note that 0-9 have ASCII values 48 through 57 and : has ASCII value 58. The output is 64 characters per line, so that leaves 8 characters per character of __TIME__.

7 - i/8%8 is thus the index of __TIME__ that is presently being output (the 7- is needed because we are iterating i downwards). So, t is the character of __TIME__ being output.

a ends up equalling the following in binary, depending on the input t:

0 00111111
1 00101000
2 01110101
3 01111001
4 01101010
5 01011011
6 01011111
7 00101001
8 01111111
9 01111011
: 01000000

Each number is a bitmap describing the segments that are lit up in our seven-segment display. Since the characters are all 7-bit ASCII, the high bit is always cleared. Thus, 7 in the segment table always prints as a blank. The second table looks like this with the 7s as blanks:

000055  
11  55  
11  55  
116655  
22  33  
22  33  
444433  

So, for example, 4 is 01101010 (bits 1, 3, 5, and 6 set), which prints as

----!!--
!!--!!--
!!--!!--
!!!!!!--
----!!--
----!!--
----!!--

To show we really understand the code, let's adjust the output a bit with this table:

  00  
11  55
11  55
  66  
22  33
22  33
  44

This is encoded as "?;;?==? '::799\x07". For artistic purposes, we'll add 64 to a few of the characters (since only the low 6 bits are used, this won't affect the output); this gives "?{{?}}?gg::799G" (note that the 8th character is unused, so we can actually make it whatever we want). Putting our new table in the original code:

main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>"?{{?}}?gg::799G"[_*2&8|_/64]/(_&2?1:8)%8&1:10);}

we get

          !!              !!                              !!   
    !!  !!              !!  !!  !!  !!              !!  !!  !! 
    !!  !!              !!  !!  !!  !!              !!  !!  !! 
          !!      !!              !!      !!                   
    !!  !!  !!          !!  !!      !!              !!  !!  !! 
    !!  !!  !!          !!  !!      !!              !!  !!  !! 
          !!              !!                              !!   

just as we expected. It's not as solid-looking as the original, which explains why the author chose to use the table he did.

Community
  • 1
  • 1
nneonneo
  • 171,345
  • 36
  • 312
  • 383
  • 2
    @drahnr - Technically it's both a `*` (dereference) and a `+` :P – detly Mar 14 '13 at 00:00
  • 19
    @АртёмЦарионов: Around 30 minutes, but I've been coming back and editing it a fair bit. I use C a lot, and I've done a few IOCCC deobfuscations for personal interest before (the last one I did, just for personal interest, was [this beautiful raytracer](http://www.ioccc.org/2004/gavare.c)). If you want to ask how it works, I'd be happy to oblige ;) – nneonneo Mar 14 '13 at 00:03
  • 5
    @АртёмЦарионов: about a day IIRC (also counts time spent understanding the raytracer geometry). That program is also very clever, because it *uses no keywords*. – nneonneo Mar 14 '13 at 00:34
  • @drahnr : could you please tell more about what it means by [] is just a + in disguise? – user13267 Mar 14 '13 at 01:17
  • 185
    C.. all the power of assembly language combined with the readability of assembly language – wim Mar 14 '13 at 05:24
  • 1
    @detly yes, but the important point is the `+` not the dereference - taken as granted to be known – drahnr Mar 14 '13 at 08:31
  • 2
    @user13267 foo[bar] is implemented by taking the address of foo and adding bar * sizeof(typeStoredInArray) to get the address of the barth element in the array. – Dan Is Fiddling By Firelight Mar 14 '13 at 18:57
  • 6
    For more in this vein, check out “Obfuscated C and Other Mysteries”, by Don Libes. It teaches C techniques by analyzing Obfuscated C Contest entries. – Chris N Apr 13 '13 at 17:11
  • Wow!!! how can you wrap your head around that!? Please explain this, on my machine running win vista, Mingw32 gcc ver 4.6.1, the application only prints system time when the app was compiled, when you run it again without compiling it prints the original system compile time. – gath Apr 24 '13 at 09:07
  • on the first 2 lines, how can `main(_) { _^448 && main(-~_);` becomes `main(int i) { if(i^448) main(-~i);`. I don't see any define here, but even if there is, `_` can't be `int i` and `if(i` at the same time – phuclv Aug 01 '13 at 14:21
  • @nneonneo: so "_" is a valid variable name? How can you use the variable withoud declaring it first. And I still don't know how `_^448 && main(-~_);` becomes `if(i^448) main(-~i)` – phuclv Aug 01 '13 at 14:57
  • @LưuVĩnhPhúc: `main(_) {...}` *implicitly declares* `_` as a parameter of type `int`. For the other, just think about what `&&` is doing. – nneonneo Aug 01 '13 at 22:49
  • 1
    I understand it a bit. But why implicitly declaring `_` automatically assign `_ = 0` too? And I have only seen main without argument or with 2 arguments like `int main ( int argc, char *argv[] )`, how can a 1 argument version be valid? – phuclv Aug 03 '13 at 14:14
  • 1
    @LưuVĩnhPhúc: The first argument to `main` is the number of arguments to the script plus one (usually called `argc`); for this program, it is 1 because the script is started with no arguments. `main` is very special, and can be specified to take different numbers of arguments without causing issues. – nneonneo Aug 03 '13 at 14:17
  • wow, that's great. I've just found an example of 3-argument main [here](http://stackoverflow.com/questions/727930/three-arguments-to-main-and-other-obfuscating-tricks?rq=1) – phuclv Aug 04 '13 at 00:14
  • 2
    @gath: The program prints out the value of `__TIME__`, which is a macro with a constant value -- the time that the app was compiled. It's not intended to print the current time. – cHao Aug 23 '13 at 12:28
  • @nneonneo I am a bit confused at calculation of shift table. For the 6th row shouldn't it be like 00005757 because for col 3 (i=443), we start with shift=61. Now since i is odd, we take the lower octal bits (101=5). But for col 2(i=442), we take higher octal bit (111=7), and likewise for col 1 and 0. Following the similar argument, all other rows should have alternating pattern e.g., row 5 should be 17175757? Am I missing any point? – stillanoob Mar 31 '17 at 06:43
  • 1
    @ibrahim5253: The test isn't to see if `i` is odd - it's `i & 2` which tests the *second* least-significant-bit of `i`. – nneonneo Mar 31 '17 at 14:59
  • @phuclv Four args have also been used. – Pryftan Apr 19 '23 at 18:28
105

Let's format this for easier reading:

main(_){
  _^448&&main(-~_);
  putchar((--_%64) ? (32|-(~7[__TIME__-_/8%8])[">'txiZ^(~z?"-48]>>(";;;====~$::199")[_*2&8|_/64]/(_&2?1:8)%8&1):10);
}

So, running it with no arguments, _ (argc conventionally) is 1. main() will recursively call itself, passing the result of -(~_) (negative bitwise NOT of _), so really it'll go 448 recursions (Only condition where _^448 == 0).

Taking that, it'll print 7 64-character wide lines (the outer ternary condition, and 448/64 == 7). So let's rewrite it a little cleaner:

main(int argc) {
  if (argc^448) main(-(~argc));
  if (argc % 64) {
    putchar((32|-(~7[__TIME__-argc/8%8])[">'txiZ^(~z?"-48]>>(";;;====~$::199")[argc*2&8|argc/64]/(argc&2?1:8)%8&1));
  } else putchar('\n');
}

Now, 32 is decimal for ASCII space. It either prints a space or a '!' (33 is '!', hence the '&1' at the end). Let's focus on the blob in the middle:

-(~(7[__TIME__-argc/8%8][">'txiZ^(~z?"-48]) >>
     (";;;====~$::199"[argc*2&8|argc/64]) / (argc&2?1:8) % 8

As another poster said, __TIME__ is the compile time for the program, and is a string, so there's some string arithmetic going on, as well as taking advantage of an array subscript being bidirectional: a[b] is the same as b[a] for character arrays.

7[__TIME__ - (argc/8)%8]

This will select one of the first 8 characters in __TIME__. This is then indexed into [">'txiZ^(~z?"-48] (0-9 characters are 48-57 decimal). The characters in this string must have been chosen for their ASCII values. This same character ASCII code manipulation continues through the expression, to result in the printing of either a ' ' or '!' depending on the location within the character's glyph.

chmeee
  • 3,608
  • 1
  • 21
  • 28
51

Adding to the other solutions, -~x is equal to x+1 because ~x is equivalent to (0xffffffff-x). This is equal to (-1-x) in 2s complement, so -~x is -(-1-x) = x+1.

nneonneo
  • 171,345
  • 36
  • 312
  • 383
Thomas Song
  • 717
  • 4
  • 7
7

I de-obfuscated the modulo arithmetics as much as I could and removed the reccursion

int pixelX, line, digit ;
for(line=6; line >= 0; line--){
  for (digit =0; digit<8; digit++){
    for(pixelX=7;pixelX > 0; pixelX--){ 
        putchar(' '| 1 + ">'txiZ^(~z?"["12:34:56"[digit]-'0'] >> 
          (";;;====~$::199"[pixel*2 & 8  | line] / (pixelX&2 ? 1 : 8) ) % 8 & 1);               
    }
  }
  putchar('\n');
}

Expanding it a bit more:

int pixelX, line, digit, shift;
char shiftChar;
for(line=6; line >= 0; line--){
    for (digit =0; digit<8; digit++){
        for(pixelX=7;pixelX >= 0; pixelX--){ 
            shiftChar = ";;;====~$::199"[pixelX*2 & 8 | line];
            if (pixelX & 2)
                shift = shiftChar & 7;
            else
                shift = shiftChar >> 3;     
            putchar(' '| (">'txiZ^(~z?"["12:34:56"[digit]-'0'] + 1) >> shift & 1 );
        }

    }
    putchar('\n');
}
Lefteris E
  • 2,806
  • 1
  • 24
  • 23