I need to implement XOR between floats in Python for huge 2D array data (e.g. a thousand-row by thousand-column matrix). I use the following implementation:
```python
import struct

def fxor(a, b):
    rtrn = []
    a = struct.pack('d', a)
    b = struct.pack('d', b)
    for ba, bb in zip(a, b):
        rtrn.append(ba ^ bb)
    return struct.unpack('d', bytes(rtrn))[0]

print(fxor(5.34, 5.34))               # 0.0
print(fxor(10.23, 5.34))              # 9.54764402360672e-308
print(fxor(10.23, fxor(10.23, 5.34))) # 5.34
```
The way I use `fxor`:
```python
import numpy as np

# for demo purposes I took a 3-by-2 matrix
mat1 = np.random.random_sample((3, 2))
mat2 = np.random.random_sample((3, 2))

resultant = []
for i in range(3):
    row = []
    for j in range(2):
        row.append(fxor(mat1[i][j], mat2[i][j]))
    resultant.append(row)
resultant
```
This works perfectly in my case, but when I profile it, the implementation turns out to be very slow for large arrays (60% of the total time is spent in `fxor`):
```
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   250000    1.438    0.000    1.926    0.000 2837056651.py:3(fxor)
   ...
   500000    0.124    0.000    0.124    0.000 {built-in method _struct.pack}
   250000    0.067    0.000    0.067    0.000 {built-in method _struct.unpack}
```
Is there any optimized way to do this, like `np.bitwise_xor` does for `int` values?
Update

@jasonharper suggested that I use `.view(np.int64)`, which works nicely:
```python
mat1 = np.random.random_sample((3, 2))
mat2 = np.random.random_sample((3, 2))
print(mat1)
mat3 = np.bitwise_xor(mat1.view(np.int64), mat2.view(np.int64))
print(np.bitwise_xor(mat2.view(np.int64), mat3).view(np.float64))
# output
# [[0.71297944 0.33048679]
#  [0.82762999 0.26549565]
#  [0.94499741 0.2570297 ]]
# [[0.71297944 0.33048679]
#  [0.82762999 0.26549565]
#  [0.94499741 0.2570297 ]]
```
But the issue is that it sometimes raises the following error:

```
ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
```

How do I handle this error?
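For reference, here is a minimal reproduction I put together (the helper name `xor_f64` is my own, not from any answer): `.view(np.int64)` only works when each row of the array occupies a multiple of 8 bytes and the last axis is contiguous, so a matrix that silently came out with a smaller integer dtype (or a sliced view) triggers exactly this `ValueError`. Forcing a contiguous `float64` buffer first avoids it:

```python
import numpy as np

def xor_f64(a, b):
    # hypothetical guard: force contiguous float64 buffers so the
    # 8-byte .view(np.int64) reinterpretation is always legal
    a = np.ascontiguousarray(a, dtype=np.float64)
    b = np.ascontiguousarray(b, dtype=np.float64)
    return np.bitwise_xor(a.view(np.int64), b.view(np.int64))

# an int32 matrix whose last axis is 12 bytes (not a multiple of 8)
# reproduces the error with a bare .view(np.int64)
m = np.arange(9, dtype=np.int32).reshape(3, 3)
try:
    m.view(np.int64)
except ValueError as e:
    print(e)

# the guarded helper accepts the same input
print(xor_f64(m, m))  # zeros, since x ^ x == 0
```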
Update 2

Everything works nicely until the array size crosses 10000. Beyond that I get two different errors on different executions. This:

```
ValueError: operands could not be broadcast together with shapes (10000,1250) (10000,10000)
```

and this:

```
ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array.
```

I can assure you that the dimensions of those matrices are the same, because they are passed through

```python
assert first_mat.shape == second_mat.shape
```

I was unable to pin down the reason, because sometimes the program runs without any issue and sometimes it raises these errors for that huge 2D array. If you want to know how I generate those arrays, I showed that in another question of mine. The problem mostly depends on the numpy view:

```
---> 46 return np.bitwise_xor(Matrix.view(np.int64),transformationMatrix.view(np.int64)).view(np.float64)
```
Update 3

@JérômeRichard suggested checking the shape of `.view()` for both matrices. I was surprised to find that my `mat1` was an `int`-valued matrix, which created the issue. I updated it to always return a `float`-valued matrix, and things worked nicely until I got `nan` values in some cases:
```python
a = np.array([[4.27666612, 4.61512052], [0.19573934, 0.82816473]])
b = np.array([[0.97597378, 0.09191992], [0.32720493, 0.86295611]])
np.bitwise_xor(a.view(np.uint8), b.view(np.uint8)).view(np.float64)
# gives
# array([[            nan, 7.72164724e+306],
#        [4.17041859e-308, 1.54832353e-309]])
```
This is not feasible for my problem. I was surprised that `nan` was returned as the result of an XOR-like operation. How do I handle this infeasibility?
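For context, a note of my own: the `nan` seems to be expected behaviour, because XOR can set all eleven exponent bits of an IEEE-754 double, and that bit pattern decodes as `nan` or `inf`. The operation is still perfectly invertible at the bit level, so keeping the intermediate result as integers (a sketch using `np.uint64` views) recovers the original exactly:

```python
import numpy as np

a = np.array([[4.27666612, 4.61512052], [0.19573934, 0.82816473]])
b = np.array([[0.97597378, 0.09191992], [0.32720493, 0.86295611]])

# keep the xor result in integer form; its float64 reinterpretation may
# legitimately decode to nan/inf because xor can set all exponent bits
masked = np.bitwise_xor(a.view(np.uint64), b.view(np.uint64))

# xoring with b again flips the same bits back, recovering a bit-exactly
recovered = np.bitwise_xor(masked, b.view(np.uint64)).view(np.float64)
print(np.array_equal(recovered, a))  # True
```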
Update 4

I still find `np.bitwise_xor` problematic with the `narray.view(np.uint8)` mode, because it gives the overflow values every time.

```python
# overflow values are
np.finfo(np.double).min, np.finfo(np.double).max
# -1.79769313486e+308, 1.79769313486e+308
```

It has even become hard to work with the resulting data. Is there no efficient solution at all?
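One pattern that would sidestep both the `nan` readings and the dtype errors (a sketch under the assumption that the intermediate xor values never need to be meaningful floats; the helper names are mine): store the masked data as `uint64` and only reinterpret it as `float64` after xoring back. This stays fully vectorised, so it should be far faster than the per-element `fxor`:

```python
import numpy as np

def float_xor(mat, key):
    """Hypothetical helper: bitwise-xor two float64 matrices, returning
    the raw uint64 bit patterns instead of reinterpreting them as floats."""
    mat = np.ascontiguousarray(mat, dtype=np.float64)
    key = np.ascontiguousarray(key, dtype=np.float64)
    return np.bitwise_xor(mat.view(np.uint64), key.view(np.uint64))

def float_unxor(masked, key):
    """Inverse: xor the stored bit patterns with the same key and
    reinterpret the result as float64 again."""
    key = np.ascontiguousarray(key, dtype=np.float64)
    return np.bitwise_xor(masked, key.view(np.uint64)).view(np.float64)

rng = np.random.default_rng(0)
mat1 = rng.random((1000, 1000))
mat2 = rng.random((1000, 1000))

masked = float_xor(mat1, mat2)        # uint64, never shown as float
restored = float_unxor(masked, mat2)  # bit-exact copy of mat1
print(np.array_equal(restored, mat1))  # True
```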