Let's consider a 2D array A:
2 3 5 7
2 3 5 7
1 7 1 4
5 8 6 0
2 3 5 7
The first, second and last rows are identical. The algorithm I'm looking for should return a 2D array that keeps only one copy of each set of identical rows, together with the number of identical rows for each row of the resulting 2D array. I currently use an inefficient naive algorithm to do that:
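import numpy

A = numpy.array([[2, 3, 5, 7], [2, 3, 5, 7], [1, 7, 1, 4], [5, 8, 6, 0], [2, 3, 5, 7]])
counts = []  # number of identical rows found for each kept row
i = 0
end = len(A)
while i < end:
    print(i, end=" ")  # trace the row being compared
    j = i + 1
    numberID = 1  # row i counts as one occurrence of itself
    while j < end:
        print(j)  # trace the candidate duplicate
        if numpy.array_equal(A[i, :], A[j, :]):
            # row j duplicates row i: delete it and stay at the same j
            A = numpy.delete(A, j, axis=0)
            end -= 1
            numberID += 1
        else:
            j += 1
    counts.append(numberID)
    i += 1
print(A, len(A))
print(numpy.array(counts))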
Expected result:
array([[2, 3, 5, 7],
       [1, 7, 1, 4],
       [5, 8, 6, 0]])  # 2D array with duplicate rows removed
array([3, 1, 1])  # number of identical rows for each remaining row
This algorithm runs native Python loops over a NumPy array, so it is inefficient. Thanks for any help.
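For reference, here is a minimal vectorized sketch of the behaviour described above. It assumes a NumPy version in which numpy.unique accepts the axis argument (1.13 or later). Since numpy.unique returns the rows in sorted order, the sketch uses the first-occurrence indices to restore the original order; the names rows, first_idx, unique_rows and row_counts are illustrative only.

```python
import numpy as np

A = np.array([[2, 3, 5, 7],
              [2, 3, 5, 7],
              [1, 7, 1, 4],
              [5, 8, 6, 0],
              [2, 3, 5, 7]])

# With axis=0, np.unique treats every row as one item.
# It returns the distinct rows in sorted (lexicographic) order,
# the index of each row's first occurrence, and how often it occurs.
rows, first_idx, counts = np.unique(
    A, axis=0, return_index=True, return_counts=True
)

# Reorder so the rows appear in the order they first show up in A.
order = np.argsort(first_idx)
unique_rows = rows[order]
row_counts = counts[order]

print(unique_rows)
print(row_counts)
```

If the order of the resulting rows does not matter, the argsort step can be dropped and rows and counts used directly.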