
I am more or less new to python/numpy and I have this problem:

I have numpy arrays in which the first and last tuples are always the same. In between, there are sometimes duplicate tuples (only directly adjacent ones) that I want to get rid of. The nesting (bracket structure) of the array should be preserved.

I already tried np.unique, but it changes the original order, which has to be maintained. My sample array looks like this:

    myarray = np.array([[[1,1],[1,1],[4,4],[4,4],[2,2],[3,3],[1,1]]])

I need a result that looks like this:

    myarray = np.array([[[1,1],[4,4],[2,2],[3,3],[1,1]]])

Thank you in advance for your support!

jpp
ccurioso
  • I think your question is already answered here: https://stackoverflow.com/questions/31097247/remove-duplicate-rows-of-a-numpy-array – Van De Wack Jun 08 '18 at 08:40
  • OP needs to preserve original order, which `np.unique()` will not do. – jedwards Jun 08 '18 at 08:41
  • @VanDeWack `In between, there are sometimes duplicate tuples (only the **ones directly next to each other)**`. – Divakar Jun 08 '18 at 08:41
  • Possible duplicate of [numpy.unique with order preserved](https://stackoverflow.com/questions/15637336/numpy-unique-with-order-preserved) – SBylemans Jun 08 '18 at 08:42
  • @SBylemans Again, the linked Q&A doesn't deal with the **ones directly next to each other** criterion. – Divakar Jun 08 '18 at 08:48
  • @Divakar true, I flagged it as such regarding his comment `I tried np.unique already, but it changes my original order (which has to be maintained).` – SBylemans Jun 08 '18 at 08:55

1 Answer


Compare each tuple with its predecessor along the second axis (a one-element offset), then use boolean indexing to keep the first tuple and every tuple that differs from the one before it:

    myarray[:,np.r_[True,(myarray[0,1:] != myarray[0,:-1]).any(-1)]]

Sample run:

    In [42]: myarray
    Out[42]:
    array([[[1, 1],
            [1, 1],
            [4, 4],
            [4, 4],
            [2, 2],
            [3, 3],
            [1, 1]]])

    In [43]: myarray[:,np.r_[True,(myarray[0,1:] != myarray[0,:-1]).any(-1)]]
    Out[43]:
    array([[[1, 1],
            [4, 4],
            [2, 2],
            [3, 3],
            [1, 1]]])
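
To make the one-liner easier to follow, here is a rough step-by-step breakdown of the same expression. The intermediate names (`rows`, `changed`, `keep`, `result`) are not part of the original answer and are introduced only for illustration; the sketch assumes the array has shape `(1, n, k)` as in the question.

```python
import numpy as np

myarray = np.array([[[1, 1], [1, 1], [4, 4], [4, 4], [2, 2], [3, 3], [1, 1]]])

# Drop the outer dimension to get the list of tuples, shape (7, 2).
rows = myarray[0]

# Compare every tuple with its predecessor; a tuple "changed" if any
# of its elements differs from the previous tuple. Shape (6,).
changed = (rows[1:] != rows[:-1]).any(-1)

# The first tuple has no predecessor, so it is always kept.
keep = np.r_[True, changed]

# Apply the mask along the second axis so the outer brackets survive.
result = myarray[:, keep]
print(keep)
print(result.shape)
```

Indexing with `myarray[:, keep]` rather than `myarray[0][keep]` is what preserves the original bracket structure.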

Alternatively, compare for equality, require ALL elements of a tuple to match its predecessor, and negate the result:

    In [47]: myarray[:,np.r_[True,~((myarray[0,1:] == myarray[0,:-1]).all(-1))]]
    Out[47]:
    array([[[1, 1],
            [4, 4],
            [2, 2],
            [3, 3],
            [1, 1]]])
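
If this is needed in more than one place, the equality-based variant could be wrapped in a small helper. The function name `drop_adjacent_duplicates` and the early return for arrays with fewer than two tuples are assumptions made for this sketch, not part of the original answer; it also assumes input of shape `(1, n, k)`.

```python
import numpy as np

def drop_adjacent_duplicates(arr):
    """Hypothetical helper: remove tuples equal to their direct predecessor.

    Assumes `arr` has shape (1, n, k). Duplicates that are not
    adjacent (such as the matching first and last tuple) are kept.
    """
    arr = np.asarray(arr)
    # With fewer than two tuples there is nothing to compare.
    if arr.shape[1] < 2:
        return arr
    # True where a tuple matches the one right before it in every element.
    same_as_prev = (arr[0, 1:] == arr[0, :-1]).all(-1)
    # Always keep the first tuple, then every tuple that differs.
    keep = np.r_[True, ~same_as_prev]
    return arr[:, keep]

sample = np.array([[[1, 1], [1, 1], [4, 4], [4, 4], [2, 2], [3, 3], [1, 1]]])
deduped = drop_adjacent_duplicates(sample)
print(deduped.tolist())
```

Because only neighbouring tuples are compared, the original order is untouched and the identical first and last tuples both remain.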
Divakar