I obtain a NumPy structured array from the following code:

data = np.genfromtxt(fname, dtype=None, comments='#', skip_header=1, usecols=ucols)

The first column holds the row indices of the data set in a scrambled order, which I wish to preserve. I would like to convert the structured array into a pandas DataFrame that uses these scrambled indices as its index, so that rows can be looked up by them.

EDIT:

import numpy as np

test = np.array([(45,1,'mars',1,1),(67,1,'pluto',1,1),(12,1,'saturn',1,1)],dtype='i,f,U10,i,f')

This creates a NumPy structured array. Accessing the first entry gives:

In [5]: test[0]
Out[5]: (45, 1., 'mars', 1, 1.)

Displaying the entire array gives:

In [6]: test
Out[6]: 
array([(45, 1., 'mars', 1, 1.), (67, 1., 'pluto', 1, 1.),
       (12, 1., 'saturn', 1, 1.)],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<U10'), ('f3', '<i4'), ('f4', '<f4')])

I want to turn this structured array into a pandas DataFrame and, in this example, use 45, 67, 12 as the index labels for accessing the rows of the data.

QuantumPanda

2 Answers

With the given example, you could use:

df = pd.DataFrame(test).set_index('f0')

With that, you can access, say, the row whose index is 45 through df.loc[45].
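
For completeness, here is a self-contained sketch of the above using the array from the question. The variable names row, name and order are only illustrative, and the field names f0 to f4 are the defaults NumPy assigns when the dtype has no explicit names.

```python
import numpy as np
import pandas as pd

# Structured array from the question; field 'f0' holds the scrambled indices.
test = np.array(
    [(45, 1, 'mars', 1, 1), (67, 1, 'pluto', 1, 1), (12, 1, 'saturn', 1, 1)],
    dtype='i,f,U10,i,f',
)

# Each field becomes a column, then 'f0' is promoted to the index.
df = pd.DataFrame(test).set_index('f0')

row = df.loc[45]             # label-based lookup of a whole row (a Series)
name = df.loc[67, 'f2']      # a single cell, selected by index label and column
order = df.index.tolist()    # the scrambled order of the labels is kept as-is
```

Note that df.loc looks rows up by label, whereas df.iloc would look them up by position, so df.loc[45] and df.iloc[0] refer to the same row here.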

fuglede

Assuming you've done import pandas as pd already:

df = pd.DataFrame(test) # converts your array to a DataFrame.
df = df.set_index('f0') # changes the index to be the first column.
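
As a possible alternative, the two steps can probably be folded into one call with DataFrame.from_records, which accepts a field name for its index argument. This is only a sketch on the question's example array; df2 is a name chosen for illustration.

```python
import numpy as np
import pandas as pd

# Same structured array as in the question.
test = np.array(
    [(45, 1, 'mars', 1, 1), (67, 1, 'pluto', 1, 1), (12, 1, 'saturn', 1, 1)],
    dtype='i,f,U10,i,f',
)

# Build the DataFrame and use field 'f0' as the index in a single step.
df2 = pd.DataFrame.from_records(test, index='f0')

# Rows are then addressed by the scrambled labels, as before.
planet = df2.loc[12, 'f2']
```

Either form should give the same layout: the remaining fields f1 to f4 as columns and the values of f0 as the index, in their original order.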
  • The code I posted creates a NumPy structured array. I want to convert this structured array into a pandas DataFrame with the first column as the index of the DataFrame. – QuantumPanda Jul 23 '18 at 19:29