L = [('key1', 14), ('key2', 20), ('key3', 13), ('key4', 10), ('key5', 11)]
Assume I got a list
like the above. The the second element of each tuple conceptually represents the size of a block memory. These block memories are adjacent and sequential. Their relationship could be represented as :
0____14____14+20____14+20+13____14+20+13+10____14+20+13+10+11
Now the user will give me the 'absolute address' and I should return a key to him to decrypt the whole memory block. For instance, if he wants to access the address '10', then I should return key1 for him. If he wants to access '15', and then I should return key2.
My implementation now has to search from the beginning every time. In reality, there are many memory blocks and users want to access all the time. The performance is bad.
L = [('key1', 14), ('key2', 20), ('key3', 13), ('key4', 10), ('key5', 11)]
def get_key(address):
c = 0
offset = 0
for i in L:
if address < i[1] + offset:
break
c += 1
offset += i[1]
return c
print(get_key(15))
How could I improve the performance?
The data structure doesn't have to be a
list
(L). But the known things should be (1) the block size, (2) the key, (3)user will access the 'absolute address'.
As per Burhan Khalid's instruction, the final code is as follows (see also How to find the cumulative sum of numbers in a list?):
from bisect import bisect
from itertools import accumulate
L = [('key1', 14), ('key2', 20), ('key3', 13), ('key4', 10), ('key5', 11)]
markers = [i[0] for i in L]
boundaries = list(accumulate([i[1] for i in L]))
##offsets = []
##offsets.append(boundaries[0])
##for offset in boundaries[1:]:
## offsets.append(sum(offsets)+offset)
def buckets(value, boundaries, markers):
i = bisect(boundaries, value)
return markers[i]
print(buckets(67, boundaries, markers))