Our (PHP) framework sometimes renders hidden inputs with the value YTowOnt9. I can't find that string anywhere in the (huge) codebase, and can't figure out where it comes from. I decided to Google that particular string, and the result surprised me: over half a million seemingly random hits. I haven't found any page describing the value itself, and it has 0 hits on Stack Overflow.

Is YTowOnt9 some kind of magic string?

Dharman
Sherlock
  • Always the same value? If it were random, I would say it could be a CSRF token or something like that. – Platinum Azure Apr 22 '14 at 15:06
  • Always the same value; this exact same value has 500.000 hits on Google. – Sherlock Apr 22 '14 at 15:06
  • It looks like a salt or token for something. Is it always the same string? Even if you logout and delete cookies/cache or use another browser? – Jurik Apr 22 '14 at 15:07
  • What PHP framework are you using? – j08691 Apr 22 '14 at 15:07
  • It's a custom framework, and please note the fact that this string occurs hundreds of thousands of times on Google. – Sherlock Apr 22 '14 at 15:07
  • `...renders hidden inputs...` PHP isn't the only potential culprit, have you searched your JS files? Are you using a templating engine? If so then check the `.tpl` files. – MonkeyZeus Apr 22 '14 at 15:16
  • Does the hidden field have any particular `name` attribute? Hidden fields are usually used to submit data, but you would need a name to access the value server side. – musefan Apr 22 '14 at 15:57

1 Answer


It seems to be a PHP-serialized empty array, Base64-encoded.

$ base64 -D <<< 'YTowOnt9'
a:0:{}
$ php -r 'var_dump(unserialize(base64_decode("YTowOnt9")));'
array(0) {
}

There are many scripts that serialize arrays of data. When the arrays contain data, they vary greatly, so the Base64-encoded, PHP-serialized values do too. When the arrays are empty, though, the encoded values are all identical. That makes it look as if a lot of very different PHP scripts have this random string in common.
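The effect described above can be reproduced without PHP, as a rough sketch. The string `a:0:{}` is the serialized empty array from the transcript; the non-empty payload is a hypothetical example written by hand in the same serialization format. The long option `--decode` is assumed to be accepted by the local `base64`, since the short decode flag differs between implementations.

```shell
# 'a:0:{}' is PHP's serialization of an empty array (from the transcript above).
empty=$(printf '%s' 'a:0:{}' | base64)
echo "$empty"     # YTowOnt9

# Hypothetical non-empty payload, hand-written in PHP's serialize() format.
nonempty=$(printf '%s' 'a:1:{i:0;s:3:"foo";}' | base64)
echo "$nonempty"  # a different string for every different payload

# Decoding recovers the serialized text.
decoded=$(printf '%s' "$empty" | base64 --decode)
echo "$decoded"   # a:0:{}
```

Every script that Base64-encodes an empty serialized array produces the first value, whatever the rest of the script does, which would explain the large number of unrelated search hits.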

Peter Mortensen
kojiro
  • `YTowOnt9` = `a:0:{}` – Tim S. Apr 22 '14 at 18:35
  • Very nice finding, this will also help me debug a bug I was trying to fix. ;) – Sherlock Apr 23 '14 at 05:44
  • @kojiro How on earth did you get to this answer? Did you just think "oh, I'll just try to deserialize it in base64, I get this feeling that'll be it!"? Please elaborate! :) – Thousand Apr 23 '14 at 06:47
  • @Thousand He probably thought "since this doesn't make sense I'll try decoding it with base64" and figured out it was `a:0:{}`, which is very clearly a serialized array. – h2ooooooo Apr 23 '14 at 09:41
  • I stared at it for a while and tried to rearrange the letters in my head. Then I suddenly realized the odd capitalization reminded me of base 64. So I tried it and got lucky. – kojiro Apr 23 '14 at 11:17
  • If you've ever had to crawl an ExpressionEngine database, you'll immediately recognise YTowOnt9 as some kind of base64! Good find! – stef Apr 23 '14 at 14:03
  • I'm impressed, but shouldn't this be `base64 -d`? Not that it really matters for this question :-) – Adrian Frühwirth Apr 23 '14 at 14:21
  • @AdrianFrühwirth GNU's `base64` uses `-d` to mean decode, so in your case, probably yes. The answer's author is most likely on OS X, which uses `-D` for decode. Portability is hard. :-) – Thanatos Apr 23 '14 at 16:38
  • Can someone explain to me why "base 64" encoding is used? Are the alternatives base 32/10; is this just some really low-level stuff? – HC_ Apr 23 '14 at 17:44
  • @HC_: The value needs to be passed in url-encoded form, so it better be text and base64 is the most efficient common encoding to text. But then the serialized data actually are text already, so it sounds more like a lame attempt at obfuscation. – Jan Hudec Apr 23 '14 at 18:25
  • @JanHudec: base64 will only contain printable characters. A serialized PHP array might not, it can contain zero bytes and such. – knittl Apr 23 '14 at 18:49
  • @knittl: It would still be more efficient to encode those odd bits with % codes as they are rare. – Jan Hudec Apr 23 '14 at 19:02
  • @JanHudec: well, that totally depends on your input data. For php serialized arrays it's probably cheaper (size wise) to use rawurlencode, but then again, it does not really matter … – knittl Apr 23 '14 at 19:12
  • @kojiro, I'm not sure it makes sense to refer to base64 as "compressions" (not even "very poor compression"), given that the output text is consistently 33% **bigger** than the input. – tobyink Apr 24 '14 at 22:31
  • @tobyink you're right. I think at the time I was thinking relative to other escaping/encoding schemes. – kojiro Apr 24 '14 at 23:51
  • @HC_ And there you have it: Base 64 is the largest power-of-two numeric base that can be represented using only printable (ASCII) characters. The larger the base, the more data can be represented in a single character. – kojiro Apr 26 '14 at 15:35
  • Doesn't it seem curious, too, that YTowOnt9 looks like a pronounceable, stylized contraction of some expression or joke (like ID-ten-T), or a typo of something, akin to "pwn"? A politician's leetspeak for "Your town won't mind". Or Y2009, the year of many events. Then again, I bet many bizarre meanings can be intuited from garbled base64 encodings. :) – Tom Pace Apr 26 '14 at 17:36
  • I was playing with base64 encoding a few months ago. I think it's at somewhere around 200 characters input size that you end up with an equal size base64 encoded. Longer than that, and your output will actually be smaller than your input. – Buttle Butkus Apr 27 '14 at 04:27
  • @ButtleButkus Many of us would like to see examples of any strings of bytes that are smaller after base64 encoding. – user2338816 Apr 27 '14 at 09:50
  • @ButtleButkus Were you, perhaps, compressing the text and then base64 encoding the compressed result? That would behave something like what you're describing: overhead for small values, a break-even point, and a net saving after that. – Tim S. Jul 16 '14 at 18:32
  • @user2338816 I was compressing the string. Is there a reason not to use compression when you can? – Buttle Butkus Jul 19 '14 at 02:17
  • @ButtleButkus Well, if the string is compressed, it kind of removes the whole issue of `base64` from the discussion. Any significant string is almost guaranteed not to have `base64` characteristics after compression (until correctly decompressed). – user2338816 Jul 19 '14 at 03:00
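The size discussion in the comments above can be checked with a small sketch. It assumes a POSIX shell with `head`, `tr`, `wc`, `gzip`, and `base64` available; the 300-byte input of zero bytes is an arbitrary, highly compressible sample chosen for illustration, not data from the thread.

```shell
raw_len=300

# Base64 alone: every 3 input bytes become 4 output characters.
# tr strips the line breaks that some base64 implementations insert.
enc_len=$(( $(head -c "$raw_len" /dev/zero | base64 | tr -d '\n' | wc -c) ))
echo "$raw_len bytes -> $enc_len Base64 characters"   # 300 bytes -> 400 Base64 characters

# Compress first, then encode: repetitive input can end up smaller than the original.
gz_len=$(( $(head -c "$raw_len" /dev/zero | gzip -c | base64 | tr -d '\n' | wc -c) ))
echo "$raw_len bytes -> $gz_len characters after gzip + Base64"
```

Base64 on its own always grows the data by about a third. Any shrinkage comes from the compression step, and only for input that compresses well.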