An answer to this question gave me the idea of using the pandas unique function instead of NumPy's unique. Looking into the documentation, I discovered that it only works on 1D arrays or tuples. My data has the format:

example = [[25.1, 0.03], [25.1, 0.03], [24.1, 15]]

so it would be possible to convert each row to a tuple, apply the unique function, and then convert the result back to an array. Does someone know a 'better' way to do this? This question might be related, but it deals with cells. I don't want to use numpy, as I have to keep the order in the array the same.

riyansh.legend

2 Answers

2

You can convert each row to a tuple and then build a unique list, which keeps the original order:

list(dict.fromkeys(map(tuple, example)))

Output:

[(25.1, 0.03), (24.1, 15)]
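If the result has to come back as a list of lists rather than a list of tuples, the same idea can be wrapped in a small helper. This is only a sketch: the name `unique_rows` is made up for illustration, and it relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on.

```python
example = [[25.1, 0.03], [25.1, 0.03], [24.1, 15]]


def unique_rows(rows):
    """Drop duplicate rows, keeping the first occurrence of each in order.

    Hypothetical helper for illustration, not part of any library.
    """
    # Lists are not hashable, so each row is turned into a tuple first.
    # dict.fromkeys keeps only the first copy of every key, in insertion order.
    unique_tuples = dict.fromkeys(map(tuple, rows))
    # Turn the tuples back into lists to match the input format.
    return [list(row) for row in unique_tuples]


print(unique_rows(example))  # [[25.1, 0.03], [24.1, 15]]
```

The rows must contain only hashable values (numbers, strings, tuples) for this to work.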
cosmic_inquiry
  • Just noting that this is roughly a thousand times faster than using pandas: `%timeit list(dict.fromkeys(map(tuple, example)))` -> `1.25 µs` `%timeit pd.DataFrame(example).drop_duplicates()` -> `1.49 ms` – cosmic_inquiry Feb 11 '20 at 07:39
  • Well, as this would have been my only reason to use pandas, this is definitely the better option. Fewer dependencies is always good. – riyansh.legend Feb 11 '20 at 08:01
1

If you'd like to use pandas:
To find the unique pairs in example, build a DataFrame instead of a Series and then call drop_duplicates, which keeps the first occurrence of each row:

pd.DataFrame(example).drop_duplicates()

      0      1
0  25.1   0.03
2  24.1  15.00

(And .values, or .to_numpy() in pandas 0.24 and later, will give you back a 2-D array.)
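Putting the round trip together, a minimal sketch might look like the following. It assumes pandas 0.24 or newer for `to_numpy()`, and the variable names are only illustrative.

```python
import pandas as pd

example = [[25.1, 0.03], [25.1, 0.03], [24.1, 15]]

# drop_duplicates keeps the first occurrence of each row by default,
# so the original row order is preserved.
unique_df = pd.DataFrame(example).drop_duplicates()

# Convert back to a 2-D NumPy array (illustrative variable name).
unique_array = unique_df.to_numpy()

print(unique_array.shape)  # (2, 2)
```

Note that the integer 15 comes back as the float 15.0, because the column holding it also contains 0.03 and is therefore stored as floats.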

andrew_reece