30

So the April 1, 2013 xkcd Externalities web comic features a Skein 1024 1024 hash breaking contest. I'm assuming that this must be nothing more than a brute force effort where random strings are hashed in an effort to match Randall's posted hash? Is this correct?

Also, my knowledge of Skein hashing theory is virtually non-existent but being a halfway decent programmer I was able to download and run both SkeinFish (C#) and Maarten Bodewes Skein implementation (Java) locally in 1024 1024 mode with some input strings. The hashes that they gave, however, were different than the hash that xkcd returned for the same input. This may be an extremely naive question but do different Skein implementations give different hashes? And what Skein implementation is xkcd using?

Thanks for pardoning my ignorance!

Tom
  • 3,006
  • 6
  • 33
  • 54
  • I have the same questions. I found this script which someone wrote for this contest. It generates random input and hashes it. I am walking through the code right now trying to understand it. Hopefully someone answers your question! https://github.com/dghubble/xkcd-hashing/blob/master/main.py – Jefferson Hudson Apr 02 '13 at 16:37
  • 4
    @JeffersonHudson OMG brute force. I was wondering what was this burning CPU smell this morning. While 36,000$ were donated to Wikipedia, electric companies got a 200,000$ income increase. – gawi Apr 02 '13 at 16:58
  • The perl and Haskell versions yield different results too. Maybe that's part of the April's fool joke. – gawi Apr 02 '13 at 16:59
  • Haha I noticed that too! I realized I hadn't checked to see if the output of the script was the same as the xkcd hash (it isn't).. I think the best way to do it would be to make a script that randomly generates ascii strings then does a POST and lets the xkcd hash function handle the hashing – Jefferson Hudson Apr 02 '13 at 17:53
  • 2
    @JeffersonHudson http://pythonhosted.org/pyskein/ is working for me. POSTing guesses is most definitely not the way to do it, unless you are counting on getting VERY lucky. – adotout Apr 02 '13 at 19:22
  • Why? Is there some way to analyze your hashes so as to narrow the search? I thought the whole endeavor was one of luck – Jefferson Hudson Apr 02 '13 at 19:53
  • @JeffersonHudson: I'd rather say the whole endeavor is one of raw processing power, since you can improve your chance by generating *many* hashes. A good CPU implementation can generate and check ~2e6 Hashes a second. The participants from our school checked around 5e13 hashes to get a good score. I doubt that the xkcd web server can handle such a rate of requests. – Niklas B. Apr 03 '13 at 09:48
  • The contest appears to have ended? – Brian Cain Apr 03 '13 at 14:27
  • Yes, the contest ended at 12:00am EST, April 3 when a new comic was posted. – Tom Apr 03 '13 at 15:38
  • @Tom -- so no one but Randall knows what the input to the hash was? – Brian Cain Apr 03 '13 at 18:12
  • @BrianCain That's correct. Unless he posted it somewhere after the contest ended. – Tom Apr 03 '13 at 22:13
  • 1
    The one from Maarten Bodewes is hopelessly out of date as he was too busy answering questions on Stackoverflow. Sorry about that :) – Maarten Bodewes Apr 08 '13 at 22:44
  • @MaartenBodewes We missed you and your answers in the JavaCard tagged questions. :( – Ebrahim Ghasemi May 27 '15 at 10:35

3 Answers3

11

There are several different iterations of the skein algorithm. XKCD is using version 1.3, which is also the most recent. Sources can be found here (look for "V1.3")

Interestingly enough, this brute-force method is the same one employed by Bitcoin to "mine" bitcoins. The big differences are the hash algorithm (SHA-256 in that case) and the target hash (which is dynamically determined to be any hash starting with a certain number of zeros.) It takes a lot of work to discover the hash, but once it has been found it is trivial to verify the source bits and that the resulting hash meets the criteria.

fbrereto
  • 35,429
  • 19
  • 126
  • 178
  • SkeinFish is version 1.3 but still gives a different result. – Tom Apr 02 '13 at 23:19
  • 3
    @Tom I used SkeinFish for some of the cracking, and it worked fine as long as you initialize it with the right magic value:`var skein = new Skein1024(); skein.Initialize(SkeinInitializationType.Normal);` – cobbal Apr 06 '13 at 00:22
7

Here's the source code the Stanford team used. We ran this on about a hundred 8-core EC2 servers for a while, but not the whole competition.

https://github.com/jhiesey/skeincrack

Justin L.
  • 3,957
  • 3
  • 31
  • 29
1

If you were hashing non-alphanumeric characters (spaces, punctuation, etc.), you may have been getting different results due to HTML form encoding. The "enctype" attribute on the form XKCD was hosting was "application/octet-stream", which according to https://developer.mozilla.org/en-US/docs/HTML/Element/form is not a browser-supported standard. I assume the browser falls back on the URL-encoding type when it sees one it doesn't recognize.

I observed the string "=" being submitted URL-encoded in Chrome, and returning a different hash than what I got locally with the latest pyskein. But when I submitted it with this curl command line (no longer works), I got the expected hash:

curl -X POST --data-binary "hashable==" "http://almamater.xkcd.com/?edu=school.edu"

The Stanford code in another answer does the same thing, and they apparently had some success. I never got any random data to locally hash to a better score than even my own school, so I never got a chance to test thoroughly how to pass arbitrary data in properly. I don't know what the exact behavior was (e.g., perhaps if you omitted hashable= the server would detect that and just hash the whole POST body), but it may have intentionally been a little tricky as part of April Fool's.

Joey Hewitt
  • 120
  • 1
  • 7