
I have a pd.Series of lists.

For example: df = pd.Series([['a', 'b'], ['c', 'd']])

I'd like to convert it to a 2d numpy array.

np.array(df.values) doesn't give the desired result: each list is treated as a single object, so the result is a 1-D array of dtype object rather than a 2D array.

How can I get a 2D array?
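
For reference, a minimal sketch that reproduces the problem. The names `s` and `arr` are illustrative and not from the original post:

```python
import numpy as np
import pandas as pd

s = pd.Series([['a', 'b'], ['c', 'd']])

# Each element of the Series is a Python list, so the underlying
# array is 1-D with dtype=object rather than a 2-D array of strings.
arr = np.array(s.values)
print(arr.shape)  # (2,)
print(arr.dtype)  # object
```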

IsaacLevon

3 Answers


In your solution, you only need to convert the values to a list of lists first:

print(np.array(df.values.tolist()))
[['a' 'b']
 ['c' 'd']]

Or create DataFrame first:

print(pd.DataFrame(df.values.tolist()).values)
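
As a rough, self-contained sketch of the two variants above (the names `s`, `via_list`, and `via_frame` are made up for illustration), both should give a 2×2 array for this input. The element dtype may differ between them, since going through a DataFrame typically yields an object array, so only the shape is checked here:

```python
import numpy as np
import pandas as pd

s = pd.Series([['a', 'b'], ['c', 'd']])

# Variant 1: turn the Series into a list of lists, then build the array.
via_list = np.array(s.values.tolist())

# Variant 2: go through a DataFrame (one column per list position).
via_frame = pd.DataFrame(s.values.tolist()).values

print(via_list.shape, via_frame.shape)  # (2, 2) (2, 2)
```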
jezrael
  • For what it's worth, this solution appears to be slightly faster than the one from @yaseco on a small test dataset. `%%timeit` results: `np.array(df.values.tolist())`: 683 µs ± 20 µs per loop; `np.stack(df.values)`: 759 µs ± 50.5 µs per loop (each: mean ± std. dev. of 7 runs, 1000 loops each). – emigre459 Jul 06 '21 at 19:38

Just apply pd.Series:

df.apply(pd.Series).values
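
A small sketch of what this does, assuming the same example Series (the names `s`, `expanded`, and `arr` are illustrative): `apply(pd.Series)` expands each list into a row of a DataFrame, and `.values` then returns the 2D array.

```python
import numpy as np
import pandas as pd

s = pd.Series([['a', 'b'], ['c', 'd']])

# Each list becomes one row; list positions become columns 0 and 1.
expanded = s.apply(pd.Series)
arr = expanded.values
print(arr.shape)  # (2, 2)
```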
koPytok

Okay, I just found np.stack can do that too.

df = pd.Series([['a', 'b'], ['c', 'd']])
np.stack(df.values).shape

Result:

(2, 2)
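
One caveat worth sketching: `np.stack` needs every list to have the same length. The `ragged` Series and the `ragged_ok` flag below are hypothetical additions for illustration, not part of the original question:

```python
import numpy as np
import pandas as pd

s = pd.Series([['a', 'b'], ['c', 'd']])
stacked = np.stack(s.values)
print(stacked.shape)  # (2, 2)

# Hypothetical ragged input: the lists have different lengths.
ragged = pd.Series([['a', 'b'], ['c']])
try:
    np.stack(ragged.values)
    ragged_ok = True
except ValueError:
    # np.stack refuses inputs whose shapes do not all match.
    ragged_ok = False
print(ragged_ok)  # False
```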

IsaacLevon