using unique function in pandas for a 2D array

Question

In this question's answer I got the idea of using pandas unique function instead of numpy unique. When looking into the documentation here I discovered that this can only be done for 1D arrays or tuples. As my data has the format:

example = [[25.1, 0.03], [25.1, 0.03], [24.1, 15]]

it would be possible to convert it to tuples and, after using the unique function, convert it back to an array. Does someone know a 'better' way to do this? This question might be related, but it deals with cells. I don't want to use numpy's unique because it sorts the result, and I have to keep the order in the array the same.

score 2 · Accepted Answer · answered Feb 11 '20 at 07:26


You can convert each row to a tuple and then build a unique list:

list(dict.fromkeys(map(tuple, example)))

Output:

[(25.1, 0.03), (24.1, 15)]
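If you need lists back rather than tuples, the same idea can be wrapped in a small helper. This is only a sketch: the name `unique_rows` is made up for illustration, and it assumes Python 3.7+ (where dicts keep insertion order) and that every row contains only hashable values.

```python
example = [[25.1, 0.03], [25.1, 0.03], [24.1, 15]]

def unique_rows(rows):
    """Drop duplicate rows, keeping the first occurrence of each in order."""
    # Lists are not hashable, so each row is converted to a tuple first.
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return [list(row) for row in dict.fromkeys(map(tuple, rows))]

result = unique_rows(example)
print(result)  # [[25.1, 0.03], [24.1, 15]]
```

The original order of first occurrences is kept, which is the reason for avoiding a sort-based approach here.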

answered Feb 11 '20 at 07:26

cosmic_inquiry


Just noting this is over a thousand times faster than using pandas: `%timeit list(dict.fromkeys(map(tuple, example)))` -> `1.25 µs` `%timeit pd.DataFrame(example).drop_duplicates()` -> `1.49 ms` – cosmic_inquiry Feb 11 '20 at 07:39
Well, as this would have been the only reason to use pandas for me, this is definitely the better option. Fewer dependencies is always good. – riyansh.legend Feb 11 '20 at 08:01

score 1 · Answer 2 · answered Feb 11 '20 at 07:23

If you'd like to use Pandas:
To find the unique pairs in example, use DataFrame instead of Series and then drop_duplicates:

pd.DataFrame(example).drop_duplicates()

      0      1
0  25.1   0.03
2  24.1  15.00

(And .values will give you back a 2-D array.)
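Put together, the pandas route might look like the sketch below. It assumes pandas 0.24 or later, where `to_numpy()` is available as the newer spelling of `.values`; the variable name `unique` is just for illustration.

```python
import pandas as pd

example = [[25.1, 0.03], [25.1, 0.03], [24.1, 15]]

# drop_duplicates keeps the first occurrence of each row by default,
# so the original row order is preserved.
unique = pd.DataFrame(example).drop_duplicates().to_numpy()

print(unique.shape)  # (2, 2)
```

Note that both columns end up as floats here, so the integer 15 comes back as 15.0.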
