Get the column names of a python numpy array

Question

I have a csv data file with a header indicating the column names.

xy   wz  hi kq
0    10  5  6
1    2   4  7
2    5   2  6

I run:

X = np.array(pd.read_csv('gbk_X_1.csv').values)

I want to get the column names:

['xy', 'wz', 'hi', 'kq']

I read this post but the solution provides me with None.

np.genfromtxt() and names=True option might help. See https://stackoverflow.com/questions/12336234/read-csv-file-to-numpy-array-first-row-as-strings-rest-as-float — dkato, Dec 01 '17 at 07:41
I think you need `pd.read_csv('gbk_X_1.csv').columns.tolist()` — jezrael, Dec 01 '17 at 07:46
Is your problem getting the structured array or getting the names out of the structured array? If the latter: `list(x.dtype.fields)`. — Paul Panzer, Dec 01 '17 at 08:05
Yes, it is also possible to use: `X = np.genfromtxt('gbk_X_1.csv', dtype=float, delimiter=',', names=True); print(X.dtype.names)` — ebrahimi, Dec 01 '17 at 09:33
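
The genfromtxt suggestion from the comments can be sketched as follows. This is a minimal example, assuming a comma-separated file; the in-memory `sample` string is only an illustrative stand-in for 'gbk_X_1.csv'. Note that names=True gives a structured 1-D array (one record per row) rather than a plain 2-D array.

```python
import io

import numpy as np

# Illustrative stand-in for 'gbk_X_1.csv' so the snippet is self-contained.
sample = "xy,wz,hi,kq\n0,10,5,6\n1,2,4,7\n2,5,2,6\n"

# names=True tells genfromtxt to take the field names from the first line.
X = np.genfromtxt(io.StringIO(sample), dtype=float, delimiter=',', names=True)

print(X.dtype.names)  # field names taken from the header line
print(X['wz'])        # columns are accessed by name
```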

score 4 · Answer 1 · answered Dec 01 '17 at 08:04

Let's assume your csv file looks like

xy,wz,hi,kq
0,10,5,6
1,2,4,7
2,5,2,6

Then use pd.read_csv to dump the file into a dataframe

df = pd.read_csv('gbk_X_1.csv')

The dataframe now looks like

df

   xy  wz  hi  kq
0   0  10   5   6
1   1   2   4   7
2   2   5   2   6

Its three main components are the

data which you can access via the values attribute

df.values

array([[ 0, 10,  5,  6],
       [ 1,  2,  4,  7],
       [ 2,  5,  2,  6]])

index which you can access via the index attribute

df.index

RangeIndex(start=0, stop=3, step=1)

columns which you can access via the columns attribute

df.columns

Index(['xy', 'wz', 'hi', 'kq'], dtype='object')

If you want the columns as a list, use the tolist method

df.columns.tolist()

['xy', 'wz', 'hi', 'kq']
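
To tie this back to the question: a plain NumPy array built from df.values carries no column names, which is probably why looking for names on the array itself gave None. A minimal sketch, assuming the sample data above is held in an in-memory string (the `sample` variable is only there for illustration, in place of 'gbk_X_1.csv'):

```python
import io

import numpy as np
import pandas as pd

# Illustrative stand-in for 'gbk_X_1.csv' so the snippet is self-contained.
sample = "xy,wz,hi,kq\n0,10,5,6\n1,2,4,7\n2,5,2,6\n"

df = pd.read_csv(io.StringIO(sample))

# Grab the names from the dataframe before dropping down to a bare array.
column_names = df.columns.tolist()
X = np.array(df.values)

print(column_names)
print(X.shape)
print(X.dtype.names)  # a plain (non-structured) array has no field names
```

So keep the dataframe (or the list of names) around next to X instead of trying to recover the names from the array afterwards.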

score 4 · Accepted Answer · edited Dec 16 '17 at 09:14


Use the following code:

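import re

# Read the file; the context manager closes it afterwards
with open('f.csv', 'r') as f:
    alllines = f.readlines()

columns = re.sub(' +', ' ', alllines[0])  # collapse runs of spaces into a single space
columns = columns.strip().split(' ')  # split on the single space (the sample file has no commas)

print(columns)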

Assume the CSV file looks like this:

xy   wz  hi kq
0    10  5  6
1    2   4  7
2    5   2  6
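
For a whitespace-separated header like this one, calling str.split() with no arguments is a possible shortcut, since it splits on any run of whitespace and ignores the trailing newline. A minimal sketch using only the standard library; the `sample` string and the `read_header` helper are hypothetical names used for illustration:

```python
import io

# Illustrative stand-in for the whitespace-separated file shown above.
sample = "xy   wz  hi kq\n0    10  5  6\n1    2   4  7\n2    5   2  6\n"


def read_header(handle):
    """Return the column names from the first line of a whitespace-separated file."""
    first_line = handle.readline()
    # split() with no arguments splits on any run of spaces or tabs
    return first_line.split()


columns = read_header(io.StringIO(sample))
print(columns)
```

With a real file, the same helper could be called as `read_header(f)` inside a `with open('f.csv') as f:` block, which also avoids reading the whole file just to get the first line.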

answered Dec 01 '17 at 08:10 by Ahmad · edited Dec 16 '17 at 09:14 by marc_s