In Kademlia, all (key, value) pairs stored by the node, aside from the ones that were originally published by the current node itself, have an expiration time based on where the current node is located in relation to the key. If the current node is of the k-closest nodes to the key, the (key, value) pair expires when it outlives the 24h from when it was originally published. If it's not a k-closest node, the expiration time is
inversely proportional to the number of nodes between the current node and the node whose ID is closest to the key ID
as per Kademlia paper. The paper also says that
this number can be inferred from the bucket structure of the current node.
There seem to be two very different ways to count nodes between the current node and the given node, and I'm not sure which one is right. We assume flat array routing table implementation next, an array with 160 buckets pre-allocated.
Xlattice Kademlia Design Specification page says that you should find a bucket index j into which the given node would fall into and count the nodes in buckets 0..j, counting all nodes in 0..j-1 and counting only nodes that are closer to the current node than the key in the final bucket j.
"Implementation of the Kademlia Distributed Hash Table" semester thesis by Bruno Spori (section "Calculation of the Number of Intermediate Nodes") calculates the distance between the current node and the given node, and counts nodes only in the buckets that have the distance less or equal to the distance between the current node and the given node.
Both those methods seem right to me, but they are completely different and yield different results. The first method counts nodes between the current node and the given node based on the distance between the current node and other nodes in our buckets. The second method counts nodes between the current node and the given node based on the distance between the given node and other nodes in our buckets.
For example, if the current node has ID 0001_0100
(let's assume 8-bit ids for the sake of the example), there are only 8 buckets that contain nodes with the following prefixes:
0001_0101
, 0001_011x
, 0001_00xx
, 0001_1xxx
, 0000_xxxx
, 001x_xxxx
, 01xx_xxxx
, 1xxx_xxxx
.
Let's say we want to calculate expiration time for a key 1010_0010
.
The distance between the current node and the given node is 182 (0001_0100 xor 1010_0010 = 182
).
Using the first method, we would count all nodes in buckets 0..6 plus the nodes in the bucket 7 that are closer to the current node than the given ID.
This works because the distances between the current node and all the buckets are: 1, 2, 4, 12, 20, 52, 84, 148.
You can find them by xoring our ID with the range the bucket covers (I replaced x with 0 to get the smallest, but not necessarily the closest, ID that would fall into that bucket), e.g. 0001_0100 xor 0001_0101 = 1
and 0001_0100 xor 1000_0000 = 148
.
All nodes up to the last one would have nodes that are <= 182 (the distance between the current node and the given ID) away from the current node. The last bucket can have nodes that are further away.
So we count the number of nodes in all 8 buckets (with partially counting the last one).
Using the second method, we count all nodes in buckets 1, 2, 4, 5 and 7. We do not count buckets 0, 3, 6.
This works because the distances between the given ID and the buckets are: 183, 180, 178, 186, 162, 130, 226, 34.
You can find them by xoring the given ID with the range the bucket covers (I replaced x with 0 to get the smallest, but not necessarily the closest, ID that would fall into that bucket), e.g. 1010_0010 xor 0001_0101 = 183
and 1010_0010 xor 1000_0000 = 34
.
Only buckets 1, 2, 4, 5 and 7 have nodes with a distance less than 182 (the distance between the current node and the given ID) with respect to the given ID.
So we count nodes only in 5 buckets out of 8.
Counting nodes in 8/8 buckets and 5/8 buckets is a big difference. Which method is the right one? Both seem to count the number of nodes between the current node and the given key, but the result is so different.
Note that the xor metric holds here, there doesn't seem to be any mistake made here. For example, the distance between the current node and a node residing the last bucket is 0001_0100 xor 1000_0000 = 148
. The distance between the given node and the same node in the last bucket is 1010_0010 xor 1000_0000 = 34
. 148 xor 34 = 182
, so d(a, b) = d(a, c) xor d(c, b)
holds.
The first method seems to count all nodes the current node knows of that are closer than 182 from the current node. The second method seems to count all nodes the current node knows of that are closer than 182 from the given node.
I'm thinking that the second method is more correct, as we want to find if we are a k-closest node to the given key. When finding nodes close to a given ID, i.e. the FIND_NODE
RPC, you do so using a process similar to the second method to identify which buckets contain nodes closest to the given ID, e.g. in the given example those would be buckets 7, 5, 4, 2, 1, 0, 3, 6 - in that exact order, with the closest first.
But then again, the first method also makes sense as we know the best about our own surrounding. We know the whole 8 buckets of nodes closer than 182 to the current node, while we know only about 5 buckets of nodes closer than 182 to the given key.