4

I'm trying to better-grasp Kademlia's XOR distance metric so I've written a small dummy program to try and understand better. I'm also not using a 160-bit number as my key here, but rather a sha256 hash of some user identifier.

Here's my xor distance function. Is this more or less correct? I'm XORing each byte– appending that to a buffer rawBytes and converting that byte buffer into an integer.

func XorDistance(node string, otherNode string) uint64 {
    var rawBytes [32]byte
    for i := 0; i < 32; i++ {
        rawBytes[i] = node[i] ^ otherNode[i]
    }
    distance, _ := binary.Uvarint(rawBytes[:])
    return distance
}
aroooo
  • 4,726
  • 8
  • 47
  • 81

1 Answers1

2

It's not correct because

You have to use the math/big package for usage like that. Here is my revised version of your snippet:

func xorDistance(node string, otherNode string) *big.Int {
    var rawBytes [32]byte
    for i := 0; i < 32; i++ {
        rawBytes[i] = node[i] ^ otherNode[i]
    }
    return big.NewInt(0).SetBytes(rawBytes[:])
}
Chang Qian
  • 1,091
  • 8
  • 15
  • Interesting– UvarInt() was actually returning a value when the two identifiers were different. Was this probably just using the first 64 bits truncated then? – aroooo Nov 06 '18 at 18:43
  • 1
    @arooo I did some more research and found that `Uvarint()` (and its signed sibling) use a unique encoding described in https://developers.google.com/protocol-buffers/docs/encoding (also in the source of `encoding/binary` package), which puts the least significant bits first and 7 bits of data (instead of 8) per byte. So it will give unexpected results trying to interpret arbitrary data in byte slices. Answer updated. – Chang Qian Nov 07 '18 at 02:28
  • 1
    @arooo Also, it's recommended that you add check for length of `node` and `otherNode` and a corresponding return value of type `error` to avoid `index out of range` panics. – Chang Qian Nov 07 '18 at 02:33