Hi, I followed this post to convert a Series into a 2D array, but it doesn't work.

The dataframe

import numpy as np
import pandas as pd
d = {'col1': [1,1,1], 'col2': [2,2,2]}
df = pd.DataFrame(data=d)

My code

np.array(df['col1'].values.tolist())

It returns a 1D array with shape (3,):

array([1, 1, 1])

I'm looking for an array with shape (3, 1).

What should I change in my code? Thanks.

Osca

2 Answers

  • df['col1'] is a series object
  • df[['col1']] is a single column dataframe

Calling .to_numpy() on a Series returns a 1D array. Calling it on a DataFrame returns a 2D array that retains the row and column structure (in this case, 3 rows and a single column).

Try this:

df[['col1']].to_numpy()
array([[1],
       [1],
       [1]])

#shape = (3,1)

Please refer to the documentation.
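
To make the difference concrete, here is a minimal sketch that compares the shapes produced by the two selections above. It also shows reshape(-1, 1) as an alternative that was not part of the original answer. The variable names as_series, as_frame, and reshaped are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 1], 'col2': [2, 2, 2]})

# Single brackets select a Series, which converts to a 1D array
as_series = df['col1'].to_numpy()

# Double brackets select a one-column DataFrame, which converts to a 2D array
as_frame = df[['col1']].to_numpy()

# Alternative (an assumption, not from the answer): reshape the 1D array;
# -1 lets NumPy infer the number of rows
reshaped = df['col1'].to_numpy().reshape(-1, 1)

print(as_series.shape)  # (3,)
print(as_frame.shape)   # (3, 1)
print(reshaped.shape)   # (3, 1)
```

Both 2D results hold the same values, so the choice is mostly a matter of whether you start from the DataFrame or already have the Series in hand.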

Akshay Sehgal

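You can do so with .to_numpy() as follows. Note that calling it on the whole DataFrame converts every column, so the result here has shape (3, 2). For reference, you may visit here.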

import pandas as pd

d = {'col1': [1, 1, 1], 'col2': [2, 2, 2]}
df = pd.DataFrame(data=d)

# Convert the whole DataFrame (both columns) to a 2D NumPy array
print(df.to_numpy())
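
If only col1 is wanted as a (3, 1) array, one possible way to get it from the full conversion is to index with a list of column positions, which keeps the column axis. This is a sketch rather than part of the original answer, and the names full, col1_2d, and col1_1d are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 1], 'col2': [2, 2, 2]})

# Full conversion: one array column per DataFrame column, shape (3, 2)
full = df.to_numpy()

# Indexing with a list ([0]) keeps the column axis, giving shape (3, 1)
col1_2d = full[:, [0]]

# Indexing with a bare integer drops the axis, giving shape (3,)
col1_1d = full[:, 0]

print(full.shape, col1_2d.shape, col1_1d.shape)
```
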
HW Siew