I've been using pandas a lot lately and have run into a slight impasse.
I have a pandas DataFrame, which is read in from a .fits file:
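import numpy as np
import pandas as pd
from astropy.io import fits  # assumption: fits is astropy.io.fits
d = fits.getdata('filename.fits')
df = pd.DataFrame(np.array(d))
df.columns = map(str.lower, df.columns)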
containing column names like: 'n_ser_f2mf1_f850lp', 'n_ser_f3mf2_f850lp', 'mtot_f2mf1_f850lp', 'mtot_f3mf2_f850lp', 'othergalaxycharacteristics_f3mf2_f850lp'
(If you're interested: these columns contain the difference in Sersic index for galaxies fit in a galaxy cluster that has been imaged by the Hubble Space Telescope (using filter F850LP) in multiple fields. f3mf2 means the galaxy is in both field 3 and field 2, so the value is valueinfield3 - valueinfield2.)
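To make the naming convention concrete, here is a minimal sketch of how such a difference column could be built. The per-field column names (`n_ser_f2_f850lp`, `n_ser_f3_f850lp`) and the values are hypothetical and only serve as an illustration; they are not from my actual catalogue.

```python
import numpy as np
import pandas as pd

# Hypothetical per-field columns; names and values are made up for illustration.
per_field = pd.DataFrame({
    'n_ser_f2_f850lp': [2.0, 3.5, np.nan],
    'n_ser_f3_f850lp': [2.5, 3.0, 4.0],
})

# 'f3mf2' reads as "field 3 minus field 2".
per_field['n_ser_f3mf2_f850lp'] = (
    per_field['n_ser_f3_f850lp'] - per_field['n_ser_f2_f850lp']
)

print(per_field['n_ser_f3mf2_f850lp'].tolist())
```

A galaxy missing from one of the two fields ends up as NaN in the difference column, which matches the NaN rows in my real data.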
Example of data structure/values:
a_df = pd.DataFrame(df_RXJ,columns=['global_id','mtot_f2mf1_f850lpser','n_ser_f2mf1_f850lp'])
print (a_df[285:290].head())
     global_id  mtot_f2mf1_f850lpser  n_ser_f2mf1_f850lp
285      286.0              0.812901             -4.5086
286      287.0              0.850700             -1.4044
287      288.0                   NaN                 NaN
288      289.0             -0.598200              2.1634
289      290.0             -0.017500              0.3278
I want to use the data contained in a column as a NumPy array. Usually I do this:
n_ser_residuals = df.n_ser_f2mf1_f850lp.values
Which results in an array:
length(array) = numberofgalaxies
array = [ nan, nan, nan, ..., 0.46969998,
1.48409998, 0.08240002]
However, I am working with column names as strings, looping through different values like this:
for p in ['f3mf2', 'f2mf1', otheroverlappingfields]:
    col0name = 'n_ser_{}_f850lp'.format(p)
    col1name = 'mtot_{}_f850lp'.format(p)
    # etc.
So to access the values I use:
n_ser_residuals = (df[col0name].values)
Which instead results in an array of length 1 that looks like:
[array([ nan, nan, nan, ..., 0.46969998,
1.48409998, 0.08240002], dtype=float32)]
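For comparison, here is a minimal sketch on a toy DataFrame (the `toy` name and its values are made up, not my real catalogue). In this small case, attribute access and string-key access seem to give the same 1-D array, so I suspect the extra wrapping comes from somewhere else in my code:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real catalogue; the values are made up.
toy = pd.DataFrame({
    'n_ser_f2mf1_f850lp': np.array([np.nan, 0.4697, 1.4841], dtype=np.float32),
    'mtot_f2mf1_f850lp': np.array([np.nan, -0.5982, -0.0175], dtype=np.float32),
})

# Build the column name from a string, as in the loop.
col0name = 'n_ser_{}_f850lp'.format('f2mf1')

by_attribute = toy.n_ser_f2mf1_f850lp.values  # attribute access
by_key = toy[col0name].values                 # string-key access

# Both are plain 1-D arrays with one entry per row.
print(by_attribute.shape, by_key.shape)
```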
Why does this method result in a different output? How can I turn this output into a list?
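To show what I mean by the second question, here is a minimal sketch, assuming the result really is a one-element Python list holding a float32 array. The `wrapped` variable and its values are made up for illustration:

```python
import numpy as np

# Made-up stand-in for the length-1 output described in the question.
wrapped = [np.array([np.nan, 0.46969998, 1.48409998], dtype=np.float32)]

# Take the single array out of the list...
inner = wrapped[0]

# ...or join every array in the list into one 1-D array.
joined = np.concatenate(wrapped)

# Convert the array to a plain Python list of floats.
as_list = inner.tolist()

print(type(as_list), len(as_list))
```

This gets me a flat list, but it does not explain where the extra level of nesting comes from in the first place.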