
I'm trying to produce a timing attack in PHP and am using PHP 7.1 with the following script:

<?php
    $find = "hello";
    $length = array_combine(range(1, 10), array_fill(1, 10, 0));
    for ($i = 0; $i < 1000000; $i++) {
        for ($j = 1; $j <= 10; $j++) {
            $testValue = str_repeat('a', $j);
            $start = microtime(true);
            if ($find === $testValue) {
                // Do nothing
            }
            $end = microtime(true);
            $length[$j] += $end - $start;
        }
    }

    arsort($length);
    $length = key($length);
    var_dump($length . " found");

    $found = '';
    $alphabet = array_combine(range('a', 'z'), array_fill(1, 26, 0));
    for ($len = 0; $len < $length; $len++) {
        $currentIteration = $alphabet;
        $filler = str_repeat('a', $length - $len - 1);
        for ($i = 0; $i < 1000000; $i++) {
            foreach ($currentIteration as $letter => $time) {
                $testValue = $found . $letter . $filler;
                $start = microtime(true);
                if ($find === $testValue) {
                    // Do nothing
                }
                $end = microtime(true);
                $currentIteration[$letter] += $end - $start;
            }
        }
        arsort($currentIteration);
        $found .= key($currentIteration);
    }
    var_dump($found);

This searches for a word with the following constraints:

  • a-z only
  • up to 10 characters

The script finds the length of the word without any issue, but the value of the word never comes back as expected with a timing attack.

Is there something I am doing wrong?

The script loops through the lengths and correctly identifies the length. It then loops through each letter (a-z) and measures how long each comparison takes. In theory, 'haaaa' should be slightly slower than 'aaaaa' because its first letter, 'h', matches. It then carries on in the same way for each of the five letters.

Running gives something like 'brhas' which is clearly wrong (it's different each time, but always wrong).

Peter Mortensen
exussum
  • Can you clarify where you're expecting the result ? – Martin Jan 14 '18 at 19:31
  • The last call, `var_dump($found);`, should show "hello". I will update the question. – exussum Jan 14 '18 at 19:35
  • Running your code produces inconsistent results, it ejects 5 random letters. Is this the error you're getting? – Martin Jan 14 '18 at 19:37
  • why use `1000000` ? – Martin Jan 14 '18 at 19:37
  • Just to choose a fairly large number. It could be smaller; I didn't test that, since it runs fast enough on my machine. Yes, 5 random letters means it has found the correct length (5) but not the specific letters. The random letters are the slowest letters as measured by microtime. – exussum Jan 14 '18 at 19:40
  • If I understand what you're doing correctly then I doubt you can do this with a timing attack if comparing different characters takes the exact same time (which I think it does) – apokryfos Jan 14 '18 at 19:41
  • The found letter is appended by `$found .= key($currentIteration);` after ordering the array by most time taken. The implementation loops through letter by letter. http://php.net/manual/en/function.hash-equals.php should be used for a timing-attack-safe version (instead of ===) – exussum Jan 14 '18 at 19:43
  • hash_equals is timing-attack safe; however, that does not mean that `===` is always timing-attack unsafe. I'm pretty sure that under certain circumstances `===` may be timing-attack safe, and you may have hit those circumstances in your second loop. – apokryfos Jan 14 '18 at 20:02
  • Sometimes I got the expected results for your script, sometimes not. I changed it to accept arguments `$find = $argv[1];` and ran this on the command line two times: `for s in hello marcell stack overflow; do php php-timing-attack.php $s; done`. I got the following results: `string(7) "9 found"` and `string(9) "zgmdykrbk"`, ***`string(7) "7 found"`*** and ***`string(7) "marcell"`***, ***`string(7) "5 found"` and `string(5) "stack"`***, `string(7) "5 found"` and `string(5) "uusov"` (1st). – marcell Jan 14 '18 at 23:35
  • And for the second run: ***`string(7) "5 found"`*** and ***`string(5) "hello"`, `string(7) "7 found"`*** and ***`string(7) "marcell"`***, ***`string(7) "5 found"` and `string(5) "stack"`***, ***`string(7) "8 found"`*** and ***`string(8) "overflow"`***. – marcell Jan 14 '18 at 23:37
  • @marcell can you post the code for that? I get nowhere close – exussum Jan 15 '18 at 11:49
  • @exussum I just modified the second line: `$find = "hello";` to `$find = $argv[1];`. Here is the proof: the md5 hash of your original file `a75273828aee0c34668faa592c0a76ca` and after the mentioned modification: `ecf5b5e18fab444fa7748cc8379dfbce`. I am on `macOS Sierra 10.12.6`, php: `PHP 7.0.26 (cli)`. Do you need other info? – marcell Jan 15 '18 at 13:02
  • Interesting. I'm running Linux and I get nothing close to the string. I will try another OS. Thank you! – exussum Jan 15 '18 at 17:40
  • Does the code have to be that way, can changes be made? Or you rather figure out first what's up? – Pyr James Jan 24 '18 at 17:24
  • Changes can be made sure. If you can get a version working I will be extremely interested – exussum Jan 24 '18 at 17:55

1 Answer


Is there something I am doing wrong?

I don't think so. I tried your code and, like you and the other people who tried it in the comments, I get completely random results for the second loop. The first one (the length) is mostly reliable, though not 100% of the time. By the way, the $argv[1] trick suggested in the comments didn't improve the consistency of the results for me, and honestly I don't see why it should.

Since I was curious I had a look at the PHP 7.1 source code. The string identity function (zend_is_identical) looks like this:

    case IS_STRING:
        return (Z_STR_P(op1) == Z_STR_P(op2) ||
            (Z_STRLEN_P(op1) == Z_STRLEN_P(op2) &&
             memcmp(Z_STRVAL_P(op1), Z_STRVAL_P(op2), Z_STRLEN_P(op1)) == 0));

Now it's easy to see why the first timing attack on the length works great. If the length is different then memcmp is never called and therefore it returns a lot faster. The difference is easily noticeable, even without too many iterations.
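
The short-circuit can be made visible with a small C sketch. This is not PHP's code: `identical_sketch` and the `memcmp_calls` counter are hypothetical names that only mirror the shape of the expression above, with plain pointers and lengths standing in for the zval macros.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counter: how many times the byte comparison actually ran. */
static unsigned long memcmp_calls = 0;

/* Mirrors the shape of the IS_STRING case quoted above, using plain
 * pointers and lengths instead of zvals (an assumption for illustration). */
static int identical_sketch(const char *a, size_t alen,
                            const char *b, size_t blen)
{
    if (a == b && alen == blen) {
        return 1;   /* same buffer: nothing to compare */
    }
    if (alen != blen) {
        return 0;   /* length mismatch: memcmp is never reached */
    }
    memcmp_calls++; /* only equal-length inputs pay for the byte comparison */
    return memcmp(a, b, alen) == 0;
}
```

In this sketch, comparing "hello" against "aaaa" leaves `memcmp_calls` untouched, while comparing it against "aaaaa" increments it. That extra call on equal lengths is the work the first loop of the question ends up measuring.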

Once you have the length figured out, in your second loop you are basically trying to attack the underlying memcmp. The problem is that the difference in timing highly depends on:

  1. the implementation of memcmp
  2. the current load and interfering processes
  3. the architecture of the machine.
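
To illustrate the first point, here is a sketch of two hypothetical comparison routines; neither is taken from a real libc. `bytewise_steps` stops at the first differing byte, so its work depends on where the mismatch is. `wordwise_steps` compares up to 8 bytes per iteration, roughly in the spirit of optimized implementations that process several bytes at once, so for a 5-character string every candidate costs the same number of iterations.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-at-a-time comparison: returns how many loop
 * iterations ran before a mismatch was found (or n if there was none). */
static size_t bytewise_steps(const char *a, const char *b, size_t n)
{
    size_t steps = 0;
    for (size_t i = 0; i < n; i++) {
        steps++;
        if (a[i] != b[i]) {
            break;
        }
    }
    return steps;
}

/* Hypothetical word-at-a-time comparison: loads up to 8 bytes per
 * iteration, so the position of a mismatch inside one word does not
 * change the iteration count. */
static size_t wordwise_steps(const char *a, const char *b, size_t n)
{
    size_t steps = 0;
    for (size_t i = 0; i < n; i += 8) {
        uint64_t wa = 0, wb = 0;
        size_t chunk = (n - i < 8) ? (n - i) : 8;
        memcpy(&wa, a + i, chunk);  /* zero-padded partial word */
        memcpy(&wb, b + i, chunk);
        steps++;
        if (wa != wb) {
            break;
        }
    }
    return steps;
}
```

Against "hello", the byte-wise routine takes one step for "aaaaa" and two for "haaaa", which is the signal the second loop in the question hopes to find. The word-wise routine takes exactly one step for both, leaving nothing to measure for a word this short.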

I recommend the article titled "Benchmarking memcmp for timing attacks" for a more detailed explanation. Its authors ran a much more precise benchmark and still could not get a clear, noticeable difference in timing. I'm simply going to quote the conclusion of the article:

"In conclusion, it highly depends on the circumstances if a memcmp() is subject to a timing attack."

rlanvin
  • Just to add a little to this answer (not totally sure it's helpful): PHP's speed probably benefits from its bytecode system. It probably caches the compiled code, so depending on which extensions you have installed, execution might get faster. And as we know, the hype around PHP 7 is its increased speed. – Ezekiel Jan 23 '18 at 15:06