I've been having a hard time finding an answer for this online, so I thought I'd query good ol' Stack Overflow.

I have this example:

import numpy as np

v = np.array([
    np.array([1, 1]),
    np.array([1, 2]),
    np.array([1, 3]),
    np.array([1, 4]),
    np.array([1, 5]),
    np.array([2, 1]),
    np.array([2, 2]),
    np.array([2, 3]),
    np.array([3, 1]),
    np.array([3, 2]),
    np.array([3, 3]),
    np.array([3, 4]),
    np.array([4, 1]),
    np.array([4, 2]),
    np.array([4, 3]),
    np.array([4, 4]),
    np.array([4, 5]),
    np.array([4, 6]),
])

k = np.split(v[:, 1], np.unique(v[:, 0], return_index=True)[1][1:])

# output below

[
    np.array([1,2,3,4,5]),
    np.array([1,2,3]),
    np.array([1,2,3,4]),
    np.array([1,2,3,4,5,6])
]

My aim is to select the first and last element of each array in the output list. What I'd like to do is something like:

k = np.array(k, dtype=object)
new_k = (([:, 0], [:, -1]))

But alas, this is not possible. Maybe there is a way to rewrite the line that creates k to just have the first and last item?

Note that I am trying to accomplish this without list comprehensions, function definitions, or loops - just "vanilla" numpy. If that's not feasible, any direction toward the next most efficient way of accomplishing this would be great.

Shmack
  • You have a list of arrays. Wrapping it in an object dtype array does not change that essential nature. – hpaulj Jun 15 '23 at 05:23
  • `[a[[0, -1]] for a in out]`, you can't do it in numpy alone in a vectorial way – mozway Jun 15 '23 at 05:26
  • @hpaulj This question (both the comment and post) is going to show my ignorance, but I don't get it. `np.array()` takes in a list and converts it to a ctype array. Why would I expect anything different? – Shmack Jun 15 '23 at 05:27
  • Because numpy arrays make sense when they don't have an object dtype. They are otherwise no better than python lists. – mozway Jun 15 '23 at 05:28
  • @mozway Okay, sure. But there is nothing "better" than list comprehension to solve this problem? I can't rewrite the initial line that creates the list? – Shmack Jun 15 '23 at 05:31
  • `np.split` used a list comprehension to create the list of slices. You're stuck with treating each of those arrays individually. – hpaulj Jun 15 '23 at 13:43
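
For reference, the list-comprehension fallback suggested in the comments can be sketched as follows. It reuses the question's `np.split` line; the compact construction of `v` and the name `first_last` are assumptions made for illustration, not part of the original post.

```python
import numpy as np

# Same data as in the question, built compactly: (group id, group size) pairs.
v = np.array([[g, i] for g, n in [(1, 5), (2, 3), (3, 4), (4, 6)]
              for i in range(1, n + 1)])

# The question's split: one 1-D array per distinct value in column 0.
groups = np.split(v[:, 1], np.unique(v[:, 0], return_index=True)[1][1:])

# Fancy-index each group with [0, -1] to keep only its first and last element.
first_last = np.array([g[[0, -1]] for g in groups])
print(first_last)
```

Because every `g[[0, -1]]` has length 2, the result is a regular `(n_groups, 2)` integer array, so no object dtype is needed.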

1 Answer

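With return_index=True, np.unique returns the index of the first occurrence of each unique value. Applied to the first column of v, those indices point to the first row of each group. Applied to the first column of the reversed array, they point to the last row of each group; those indices refer to v[::-1], so they have to be used to index the reversed array.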

import numpy as np

v = np.array([
    [1, 1],
    [1, 2],
    [1, 3],
    [1, 4],
    [1, 5],
    [2, 1],
    [2, 2],
    [2, 3],
    [3, 1],
    [3, 2],
    [3, 3],
    [3, 4],
    [4, 1],
    [4, 2],
    [4, 3],
    [4, 4],
    [4, 5],
    [4, 6],
])

first_values = v[np.unique(v[:, 0], return_index=True)[1], 1]
last_values = v[::-1][np.unique(v[::-1, 0], return_index=True)[1], 1]

k = np.vstack((first_values, last_values)).T

This gives the desired result:

array([[1, 5],
       [1, 3],
       [1, 4],
       [1, 6]])
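
A related sketch, assuming the rows of `v` are already sorted and grouped contiguously by the first column (true for the example, but an assumption in general): the last row of each group sits one position before the next group's first row, so a single `np.unique` call may be enough. The names `starts`, `ends`, and `first_last` are illustrative.

```python
import numpy as np

# Same data as above, built compactly: (group id, group size) pairs.
v = np.array([[g, i] for g, n in [(1, 5), (2, 3), (3, 4), (4, 6)]
              for i in range(1, n + 1)])

# Row index where each group starts (relies on column 0 being sorted and contiguous).
starts = np.unique(v[:, 0], return_index=True)[1]

# Each group ends one row before the next group starts;
# the last group ends at the final row of v.
ends = np.append(starts[1:], len(v)) - 1

first_last = np.stack((v[starts, 1], v[ends, 1]), axis=1)
print(first_last)
```

If the first column is not sorted, this shortcut does not apply and the reversed-array approach above is the safer choice.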
jared
  • Would you have to `.T` if you used hstack? – Shmack Jun 15 '23 at 16:25
  • `hstack` wouldn't work because `first_values` and `last_values` are shape `(4,)` not `(4,1)`. Using `hstack` here would just create an array of shape `(8,)` rather than the desired `(4,2)`. To use `hstack`, you'd need to either (1) reshape the arrays first or (2) `np.hstack((first_values[:,None], last_values[:,None]))`. To me, using `vstack` with a transpose is clearer. – jared Jun 15 '23 at 17:15
  • Awesome, thanks for your answer! – Shmack Jun 15 '23 at 17:18
  • A variant on the `vstack`: `np.stack((first_values, last_values), axis=1)` – hpaulj Jun 15 '23 at 18:13
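
The stacking options discussed in these comments can be compared side by side. This is a small sketch that hard-codes `first_values` and `last_values` to the values from the example; the names `a`, `b`, `c`, and `flat` are illustrative.

```python
import numpy as np

# Values taken from the example's result; both arrays have shape (4,).
first_values = np.array([1, 1, 1, 1])
last_values = np.array([5, 3, 4, 6])

# vstack gives shape (2, 4); the transpose turns it into (4, 2).
a = np.vstack((first_values, last_values)).T

# hstack works once each array is reshaped to a (4, 1) column.
b = np.hstack((first_values[:, None], last_values[:, None]))

# stack along a new second axis builds the (4, 2) result directly.
c = np.stack((first_values, last_values), axis=1)

# hstack on the raw 1-D arrays just concatenates them into shape (8,).
flat = np.hstack((first_values, last_values))

print(a.shape, b.shape, c.shape, flat.shape)
```

All three of `a`, `b`, and `c` hold the same values, so the choice between them is mostly a matter of readability.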