This is a follow-up to the post "How to extract rows from a numpy array based on the content?". I used the following code to split rows based on the content of the second column:
np.split(sorted_a,np.unique(sorted_a[:,1],return_index=True)[1][1:])
The code worked fine at first, but when I later applied it to other data (see below), it sometimes gave wrong results, as shown in CASE#1.
CASE#1
[[2748309, 246211, 1],
[2748309, 246211, 2],
[2747481, 246201, 54]]
OUTPUT#1
[]
[[2748309, 246211, 1],
[2748309, 246211, 2],
[2747481, 246201, 54]]
the result I want
[[2748309, 246211, 1],
[2748309, 246211, 2]]
[[2747481, 246201, 54]]
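To make CASE#1 reproducible, here is a minimal sketch that rebuilds the array and inspects the indices that np.unique hands to np.split. The names values, first_idx, split_points and parts are only for illustration. np.unique returns the unique values in ascending order together with the index of each value's first occurrence, so for this data the indices come out as [2, 0] and the only split point is 0. That would explain the empty first piece, and it suggests that the row order, rather than the size of the numbers, may be the cause.

```python
import numpy as np

# The CASE#1 data; the second column is [246211, 246211, 246201],
# i.e. it is NOT in ascending order.
sorted_a = np.array([[2748309, 246211, 1],
                     [2748309, 246211, 2],
                     [2747481, 246201, 54]])

# Unique values come back sorted ascending, each paired with the
# index of its first occurrence in the original column.
values, first_idx = np.unique(sorted_a[:, 1], return_index=True)
print(values)      # unique values of the second column
print(first_idx)   # first-occurrence index of each unique value

# Same expression as in the question: drop the first index, split at the rest.
split_points = first_idx[1:]
parts = np.split(sorted_a, split_points)
for p in parts:
    print(p.shape)
```

If this reading is right, sorting the rows by the second column in ascending order before splitting should avoid the empty piece, although the groups would then come out in ascending key order rather than in the original row order.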
I suspect the code only splits rows correctly when the numbers are small (have few digits), and I don't know how to solve the problem shown in CASE#1 above. So in this post, I have two small related questions:
1. How can I split rows that contain larger numbers (as shown in CASE#1)?
2. How can I split data that includes both (a) rows with the same element in the second column but different elements in the first, and (b) rows with the same element in the first column but different elements in the second? (That is, can Python distinguish rows by considering the contents of the first and second columns simultaneously?)
Feel free to give me suggestions, thank you.
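For question 2, one possible approach is sketched below. It is not the code from the linked post. The helper split_on_key_change and its key_cols parameter are hypothetical names introduced here. The idea is to compare each row with the previous one on both key columns and split wherever either column changes. It assumes that rows sharing the same key pair are already adjacent, and it keeps the original row order.

```python
import numpy as np

def split_on_key_change(a, key_cols=(0, 1)):
    """Split `a` wherever consecutive rows differ in any of `key_cols`.

    Hypothetical helper; assumes rows with equal keys are adjacent.
    """
    keys = a[:, list(key_cols)]
    # True at position i if row i+1 differs from row i in any key column.
    changed = np.any(keys[1:] != keys[:-1], axis=1)
    # Convert to split points: a new group starts at row i+1.
    boundaries = np.flatnonzero(changed) + 1
    return np.split(a, boundaries)

a = np.array([[2748309, 246211, 1],
              [2748309, 246211, 2],
              [2747481, 246201, 54]])

groups = split_on_key_change(a)
for g in groups:
    print(g)
```

On the CASE#1 data this yields two groups: the two rows with key (2748309, 246211), followed by the single row with key (2747481, 246201).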
Update#1
The ravel_multi_index function can handle this kind of task with integer arrays, but how can I deal with arrays containing floats?
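For float arrays, a sketch that avoids ravel_multi_index is to sort the rows with np.lexsort on both key columns and then split wherever the key pair changes. The sample data and variable names below are made up for illustration. Note that this compares floats for exact equality, so values that differ only by rounding error would land in different groups; rounding the keys first might be needed in that situation.

```python
import numpy as np

# Made-up float data: the first two columns are the keys.
a = np.array([[1.5, 0.25, 3.0],
              [0.5, 0.25, 1.0],
              [1.5, 0.25, 4.0],
              [0.5, 0.75, 2.0]])

# np.lexsort treats the LAST key as the primary one:
# sort by column 0 first, then by column 1 within ties.
order = np.lexsort((a[:, 1], a[:, 0]))
s = a[order]

# Split wherever either key column changes between consecutive rows.
keys = s[:, :2]
boundaries = np.flatnonzero(np.any(keys[1:] != keys[:-1], axis=1)) + 1
groups = np.split(s, boundaries)

for g in groups:
    print(g)
```

Because the rows are sorted first, the groups come out in ascending key order rather than in the original row order.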