Here are my initial 2-D and 1-D arrays:

import numpy as np

a = np.array([
    [ 100, 2, 3, 4 ],
    [ 200, 5, 6, 7 ],
    [ 100, 8, 9, 10 ],
    [ 100, 11, 12, 13 ],
    [ 200, 14, 15, 16 ]
])

b = np.array([100, 200])

I would like to create a new 3-D array that looks like so:

[
    [ [ 100, 2, 3, 4 ] , [ 100, 8, 9, 10 ] , [ 100, 11, 12, 13 ] ],
    [ [ 200, 5, 6, 7 ] , [ 200, 14, 15, 16 ] ]
]

In other words, I want to generate a new 3-D array where each "row" (first dimension) corresponds to an element (bn) in b and consists of all rows of a whose first column equals bn, i.e. a[a[:, 0] == bn].

I have managed to achieve this like so:

c = np.array([
    a[a[:, 0] == bn]
    for bn in b
])

But this is quite slow on large datasets, and I'm wondering if there's any way to achieve this using NumPy functionality instead of the for loop.
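For reference, one possible way to avoid recomputing a boolean mask over all of a for every element of b is to sort a once by its first column and then slice out each group. The following is only a sketch under that assumption: `group_rows_by_key` is a hypothetical helper name, it returns a list of 2-D arrays rather than a true 3-D array (the groups have different lengths), and it still loops over b, although each iteration only takes a slice.

```python
import numpy as np

a = np.array([
    [100, 2, 3, 4],
    [200, 5, 6, 7],
    [100, 8, 9, 10],
    [100, 11, 12, 13],
    [200, 14, 15, 16],
])
b = np.array([100, 200])


def group_rows_by_key(a, b):
    """Hypothetical helper: one 2-D array of rows per key in b."""
    keys = a[:, 0]
    # A stable sort keeps the original row order inside each group.
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]
    sorted_rows = a[order]
    # For every key in b, find where its run of rows starts and stops.
    starts = np.searchsorted(sorted_keys, b, side="left")
    stops = np.searchsorted(sorted_keys, b, side="right")
    # Keys that do not occur in a yield an empty (0, n_columns) slice.
    return [sorted_rows[s:e] for s, e in zip(starts, stops)]


groups = group_rows_by_key(a, b)
for key, rows in zip(b, groups):
    print(key, rows.tolist())
```

The sort costs O(n log n) once, after which each group is a cheap slice, whereas the list comprehension above scans all of a once per element of b. Whether this is actually faster depends on the sizes of a and b, so it is worth timing on real data.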

Jordan
  • You cannot do this with NumPy arrays: all the elements along a dimension must have the same size, which is not the case here (3 vs. 2), unless you use an object dtype, which defeats the purpose of using NumPy (and won't be much faster). – mozway Jan 26 '22 at 16:51
  • What are the `shape` and `dtype` of `c`? – hpaulj Jan 26 '22 at 16:53
  • Does this answer your question? [Is there any numpy group by function?](https://stackoverflow.com/questions/38013778/is-there-any-numpy-group-by-function) – Michael Szczesny Jan 26 '22 at 16:55
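
To make the point about unequal group sizes concrete, here is a small sketch of what a container for the two groups looks like when it is built explicitly as an object array. The names `g0` and `g1` are illustrative only; the values are the two groups from the question.

```python
import numpy as np

# The two groups from the question: 3 rows for key 100, 2 rows for key 200.
g0 = np.array([[100, 2, 3, 4], [100, 8, 9, 10], [100, 11, 12, 13]])
g1 = np.array([[200, 5, 6, 7], [200, 14, 15, 16]])

# A regular 3-D array would need both groups to have the same number of rows.
# An object array can hold them, but it is only a 1-D array of references.
c = np.empty(2, dtype=object)
c[0] = g0
c[1] = g1

print(c.shape)                 # (2,)
print(c.dtype)                 # object
print(c[0].shape, c[1].shape)  # (3, 4) (2, 4)
```

Because each element of `c` is a separate array, vectorized operations across the whole structure are not available, which is why a plain Python list of arrays is usually just as practical here.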
