I have two dictionaries, A and B, and both have the same keys: a, b and value. The three values behind those keys are numpy arrays of the same size within one dictionary, but the size may differ between A and B. I found this link here, but it only covers one-dimensional keys. One can think of a combination a[0], b[0] as coordinates in Cartesian space and value[0] as the value at that point. I have two such datasets, A and B. As an example:
A = {'a': numpy.array([1, 1, 9, 9]),
     'b': numpy.array([0, 1, 0, 1]),
     'value': numpy.array([1, 2, 3, 4])}
B = {'a': numpy.array([1, 1, 7, 7]),
     'b': numpy.array([0, 1, 0, 1]),
     'value': numpy.array([101, 102, 1003, 1004])}
I need to sum the values of those dictionaries where the key combination (a, b) is the same in both; otherwise I want to append the keys and the values. In the example, both dictionaries share the key combinations a:1, b:0 and a:1, b:1, so their values are added up: 1+101=102 and 2+102=104. The key combinations a:9, b:0 and a:9, b:1 are only in dictionary A. The key combinations a:7, b:0 and a:7, b:1 are only in dictionary B. So I want this result:
C = {'a': numpy.array([1, 1, 9, 9, 7, 7]),
     'b': numpy.array([0, 1, 0, 1, 0, 1]),
     'value': numpy.array([102, 104, 3, 4, 1003, 1004])}
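To pin down the intended semantics, here is a minimal and deliberately slow pure-Python sketch that reproduces C from the example. The name merge_reference is made up for illustration, and the sketch assumes the result should keep the order in which each (a, b) pair is first seen (A first, then B). It is only meant as a reference to check faster versions against.

```python
import numpy

def merge_reference(A, B):
    """Slow reference: sum 'value' per (a, b) pair, keeping first-seen order."""
    totals = {}  # plain dicts keep insertion order in Python 3.7+
    for d in (A, B):
        for a, b, v in zip(d['a'].tolist(), d['b'].tolist(), d['value'].tolist()):
            totals[(a, b)] = totals.get((a, b), 0) + v
    keys = list(totals)
    return {'a': numpy.array([k[0] for k in keys]),
            'b': numpy.array([k[1] for k in keys]),
            'value': numpy.array([totals[k] for k in keys])}

A = {'a': numpy.array([1, 1, 9, 9]),
     'b': numpy.array([0, 1, 0, 1]),
     'value': numpy.array([1, 2, 3, 4])}
B = {'a': numpy.array([1, 1, 7, 7]),
     'b': numpy.array([0, 1, 0, 1]),
     'value': numpy.array([101, 102, 1003, 1004])}

C = merge_reference(A, B)
print(C['value'])
```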
I came up with a solution which takes dictionary A and modifies it in place by adding or appending entries from dictionary B. It first generates one-dimensional hash keys from the two-dimensional key combinations, one set for A and one for B. It then uses numpy.intersect1d() to find the keys common to both dictionaries and adds the values of B to the values of A at those indices. Afterwards I take the inverse of the intersection and append both the keys that occur only in B and their values to dictionary A.
import numpy

def example(A, B):
    # generate hash keys (32 bit shift because values in a and b are larger than in example)
    # list(...) is needed on Python 3, where map returns an iterator instead of a list
    hash_A = numpy.array(list(map(lambda a, b: (int(a) << 32) + int(b), A['a'], A['b'])))
    hash_B = numpy.array(list(map(lambda a, b: (int(a) << 32) + int(b), B['a'], B['b'])))
    # intersection is now 1-dimensional and easy
    intersect = numpy.intersect1d(hash_A, hash_B)
    # common keys (this relies on the common keys appearing in the same order in A and B)
    A['value'][numpy.isin(hash_A, intersect)] += B['value'][numpy.isin(hash_B, intersect)]
    # keys only in B and not in A
    only_in_B = numpy.isin(hash_B, intersect, invert=True)
    if only_in_B.any():
        A['a'] = numpy.append(A['a'], B['a'][only_in_B])
        A['value'] = numpy.append(A['value'], B['value'][only_in_B])
        A['b'] = numpy.append(A['b'], B['b'][only_in_B])
    return A
But my solution seems too slow to be useful, and I cannot think of a quicker way of getting there. The numpy arrays used have millions of entries, and this is done for several combinations of dictionaries, so speed is an issue. Any help would be appreciated.
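One direction that might avoid the Python-level loop in the hashing step is to pack the keys with array arithmetic and group them with numpy.unique. The following is only a sketch: the names pack_keys and merge_vectorized are made up, it assumes a and b hold non-negative integers with a below 2**31 and b below 2**32, and it has not been timed on arrays with millions of entries. Unlike the function above, it returns a new dictionary instead of modifying A.

```python
import numpy

def pack_keys(a, b):
    """Pack each (a, b) pair into one int64 (assumes 0 <= a < 2**31, 0 <= b < 2**32)."""
    return (a.astype(numpy.int64) << 32) + b.astype(numpy.int64)

def merge_vectorized(A, B):
    # stack both datasets, A first, so A's rows get the lower row indices
    a = numpy.concatenate([A['a'], B['a']])
    b = numpy.concatenate([A['b'], B['b']])
    value = numpy.concatenate([A['value'], B['value']])
    keys = pack_keys(a, b)
    # uniq is sorted; first[i] is the first row holding uniq[i];
    # inverse[j] is the position in uniq of the key in row j
    uniq, first, inverse = numpy.unique(keys, return_index=True, return_inverse=True)
    sums = numpy.zeros(len(uniq), dtype=value.dtype)
    numpy.add.at(sums, inverse, value)  # unbuffered add, so repeated keys accumulate
    # restore first-seen order (A's keys, then the keys found only in B)
    order = numpy.argsort(first)
    rows = first[order]
    return {'a': a[rows], 'b': b[rows], 'value': sums[order]}

A = {'a': numpy.array([1, 1, 9, 9]),
     'b': numpy.array([0, 1, 0, 1]),
     'value': numpy.array([1, 2, 3, 4])}
B = {'a': numpy.array([1, 1, 7, 7]),
     'b': numpy.array([0, 1, 0, 1]),
     'value': numpy.array([101, 102, 1003, 1004])}

C = merge_vectorized(A, B)
print(C['a'], C['b'], C['value'])
```

Because every row is grouped by its packed key, this sketch does not depend on the common keys appearing in the same order in A and B, and it also sums keys that are repeated within a single dictionary.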