15

Looking at the Go standard library, there's a ConstantTimeByteEq function in crypto/subtle that looks like this:

func ConstantTimeByteEq(x, y uint8) int {
    z := ^(x ^ y) // 0xFF if x == y; otherwise at least one bit of z is 0
    z &= z >> 4   // fold the byte's bits together...
    z &= z >> 2
    z &= z >> 1   // ...bit 0 ends up 1 only if all eight bits of z were 1

    return int(z)
}

Now, I understand the need for constant time string (array, etc.) comparison, as a regular algorithm could short-circuit after the first unequal element. But in this case, isn't a regular comparison of two fixed-sized integers a constant time operation at the CPU level already?
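
To make the premise concrete, this is the kind of short-circuiting comparison I have in mind (just a sketch of my own, not any particular library's code):

// A naive comparison returns as soon as a byte differs, so its running
// time leaks the position of the first mismatch.
func leakyEqual(x, y []byte) bool {
    if len(x) != len(y) {
        return false
    }
    for i := range x {
        if x[i] != y[i] {
            return false // early exit: timing depends on the data
        }
    }
    return true
}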

justinas
  • Possibly related to: http://stackoverflow.com/questions/17603487/how-does-constanttimebyteeq-work – dyoo Aug 22 '13 at 01:01

3 Answers

11

The point is likely to avoid branch mispredictions, in addition to having the result as 1 or 0 instead of true or false (allowing follow-ups as bitwise operations).
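
For instance (a sketch of my own, not how crypto/subtle's ConstantTimeCompare is actually written), per-byte results can be folded together with a bitwise AND, so nothing branches until the caller finally looks at the combined result:

import "crypto/subtle"

// allEqual reports, as 0 or 1, whether x and y hold the same bytes,
// combining the per-byte results with & instead of branching on each one.
func allEqual(x, y []byte) int {
    if len(x) != len(y) {
        return 0
    }
    eq := 1
    for i := range x {
        eq &= subtle.ConstantTimeByteEq(x[i], y[i])
    }
    return eq
}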

Compare how this compiles:

var a, b, c, d byte
_ =  a == b && c == d

=>

0017 (foo.go:15) MOVQ    $0,BX
0018 (foo.go:15) MOVQ    $0,DX
0019 (foo.go:15) MOVQ    $0,CX
0020 (foo.go:15) MOVQ    $0,AX
0021 (foo.go:16) JMP     ,24
0022 (foo.go:16) MOVQ    $1,AX
0023 (foo.go:16) JMP     ,30
0024 (foo.go:16) CMPB    BX,DX
0025 (foo.go:16) JNE     ,29
0026 (foo.go:16) CMPB    CX,AX
0027 (foo.go:16) JNE     ,29
0028 (foo.go:16) JMP     ,22
0029 (foo.go:16) MOVQ    $0,AX

With this:

var a, b, c, d byte
_ =  subtle.ConstantTimeByteEq(a, b) & subtle.ConstantTimeByteEq(c, d)

=>

0018 (foo.go:15) MOVQ    $0,DX
0019 (foo.go:15) MOVQ    $0,AX
0020 (foo.go:15) MOVQ    $0,DI
0021 (foo.go:15) MOVQ    $0,SI
0022 (foo.go:16) XORQ    AX,DX
0023 (foo.go:16) XORQ    $-1,DX
0024 (foo.go:16) MOVQ    DX,BX
0025 (foo.go:16) SHRB    $4,BX
0026 (foo.go:16) ANDQ    BX,DX
0027 (foo.go:16) MOVQ    DX,BX
0028 (foo.go:16) SHRB    $2,BX
0029 (foo.go:16) ANDQ    BX,DX
0030 (foo.go:16) MOVQ    DX,AX
0031 (foo.go:16) SHRB    $1,DX
0032 (foo.go:16) ANDQ    DX,AX
0033 (foo.go:16) MOVBQZX AX,DX
0034 (foo.go:16) MOVQ    DI,BX
0035 (foo.go:16) XORQ    SI,BX
0036 (foo.go:16) XORQ    $-1,BX
0037 (foo.go:16) MOVQ    BX,AX
0038 (foo.go:16) SHRB    $4,BX
0039 (foo.go:16) ANDQ    BX,AX
0040 (foo.go:16) MOVQ    AX,BX
0041 (foo.go:16) SHRB    $2,BX
0042 (foo.go:16) ANDQ    BX,AX
0043 (foo.go:16) MOVQ    AX,BX
0044 (foo.go:16) SHRB    $1,BX
0045 (foo.go:16) ANDQ    BX,AX
0046 (foo.go:16) MOVBQZX AX,BX

Although the latter version is longer, it is also straight-line code -- there are no branches.
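
That 0/1 result also composes with the rest of the package; as a small follow-up sketch (my example, though subtle.ConstantTimeSelect itself is a real helper), a value can be chosen based on the comparison without reintroducing a data-dependent branch:

import "crypto/subtle"

// pickToken returns a when the bytes match and b otherwise. ConstantTimeSelect
// expects its first argument to be exactly 0 or 1, which ConstantTimeByteEq
// guarantees, so the selection can be done with bit masks rather than a jump.
func pickToken(x, y uint8, a, b int) int {
    v := subtle.ConstantTimeByteEq(x, y) // 1 if x == y, else 0
    return subtle.ConstantTimeSelect(v, a, b)
}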

Gustavo Niemeyer
  • This is a "what", but where is the answer to the OP question - "why"? – zzzz Aug 22 '13 at 06:36
  • The answer to "why" is [here](http://stackoverflow.com/a/17603874/532430): _This function tries to make it so that all calls take the same time regardless of the values of its inputs. This way, an attacker can't use timing based attacks._ – thwd Aug 22 '13 at 12:15
  • No, the question is actually _"isn't a regular comparison of two fixed-sized integers a constant time operation at the CPU level already",_ literally. – Gustavo Niemeyer Aug 22 '13 at 13:51
  • @GustavoNiemeyer: Your answer _does not_ answer that question either. Anyway, @tomwilde is right as the OP question is _"Why do we need a constant time *single byte* comparison function?"_. – zzzz Aug 23 '13 at 13:20
7

Not necessarily. And it is hard to tell what the compiler will emit after doing its optimizations; you could end up with different machine code for the high-level "compare one byte". Leaking just a tiny bit through a side channel might change your encryption from "basically unbreakable" to "hopefully not worth the money needed for a crack".

Volker
0

If the code which called the function were to immediately branch based upon the result, the use of the constant-time method wouldn't provide much extra security. On the other hand, if one were to call the function on a bunch of different pairs of bytes, keeping a running total of the results, and only branch based upon the final result, then an outside snooper might be able to determine whether that last branch was taken, but wouldn't know which of the previous byte comparisons was responsible for it.

That having been said, I'm not sure I see a whole lot of advantage in most use cases to going through the trouble of distilling the method's output to a zero or one; simply keeping a running tally of notEqual = (A0 ^ B0); notEqual |= (A1 ^ B1); notEqual |= (A2 ^ B2); ... would achieve the same effect and be much faster.
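
Spelled out in Go (my own rendering of that pattern, not library code), the running tally looks like this; nothing branches on the byte values until the single test at the very end:

// equalAccumulate ORs together the XOR of every byte pair; notEqual can stay 0
// only if every pair matched, and the result is examined only after the loop.
func equalAccumulate(a, b []byte) bool {
    if len(a) != len(b) {
        return false
    }
    var notEqual byte
    for i := range a {
        notEqual |= a[i] ^ b[i]
    }
    return notEqual == 0
}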

supercat