
I have a DataFrame, and I set its index to one of its columns. This appears to create a hierarchical column index. I want to flatten the columns to a single level. This is similar to the question "Python Pandas - How to flatten a hierarchical index in columns", except that the columns do not overlap (i.e. 'id' is not at level 0 of the hierarchical index, and the other columns are at level 1 of the index).

import pandas as pd

df = pd.DataFrame([(101,3,'x'), (102,5,'y')], columns=['id', 'A', 'B'])
df.set_index('id', inplace=True)

      A    B
id
101   3    x
102   5    y

Desired output is flattened columns, like this:

id    A    B
101   3    x
102   5    y
Nick D

3 Answers


There will always be an index in your DataFrames. If you don't set 'id' as the index, it stays at the same level as the other columns, and pandas populates the index with increasing integers starting from 0.

df = pd.DataFrame([(101,3,'x'), (102,5,'y')], columns=['id', 'A', 'B'])

In[52]: df
Out[52]: 
    id  A  B
0  101  3  x
1  102  5  y
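
To confirm the default index described above, you can inspect it directly. This is a minimal sketch, assuming a reasonably recent pandas; the variable names `default_labels` and `column_names` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([(101, 3, 'x'), (102, 5, 'y')], columns=['id', 'A', 'B'])

# With no index set, pandas supplies integer labels starting at 0.
default_labels = list(df.index)

# 'id' is an ordinary column here, alongside 'A' and 'B'.
column_names = list(df.columns)

print(default_labels)
print(column_names)
```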

The index is there so you can select rows from the original DataFrame, such as:

df.iloc[0]
Out[53]: 
id    101
A       3
B       x
Name: 0, dtype: object

So let's say you want 'id' as both the index and a column, which is very redundant. You could do:

df = pd.DataFrame([(101,3,'x'), (102,5,'y')], columns=['id', 'A', 'B'])
df.set_index('id', inplace=True)
df['id'] = df.index
df
Out[55]: 
     A  B   id
id            
101  3  x  101
102  5  y  102
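
If you really do want 'id' in both places, a possible alternative is the `drop` parameter of `set_index`, which keeps the column while also using it as the index. This is a sketch rather than part of the original answer; `df_both` and `row` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame([(101, 3, 'x'), (102, 5, 'y')], columns=['id', 'A', 'B'])

# drop=False keeps 'id' as a column while also using it as the index.
df_both = df.set_index('id', drop=False)

# Label-based lookup by 'id' still works through the index.
row = df_both.loc[101]

print(df_both)
print(row)
```

Unlike assigning `df['id'] = df.index` afterwards, this leaves 'id' in its original column position.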

With this you can select by 'id', such as:

df.loc[101]
Out[57]: 
A       3
B       x
id    101
Name: 101, dtype: object

But it would be the same info as:

df = pd.DataFrame([(101,3,'x'), (102,5,'y')], columns=['id', 'A', 'B'])
df.set_index('id', inplace=True)
df.loc[101]

Out[58]: 
A    3
B    x
Name: 101, dtype: object
Steven G

Given:

>>> df2=pd.DataFrame([(101,3,'x'), (102,5,'y')], columns=['id', 'A', 'B'])
>>> df2.set_index('id', inplace=True)
>>> df2
     A  B
id       
101  3  x
102  5  y

For pretty printing, you can produce a copy of the DataFrame with the index reset and use .to_string:

>>> print(df2.reset_index().to_string(index=False))
id  A  B
101  3  x
102  5  y

Then play around with the formatting options so that the output suits your needs:

>>> fmts = [lambda s: "{:^5}".format(str(s).strip())] * 3
>>> print(df2.reset_index().to_string(index=False, formatters=fmts))
id     A      B
101    3      x  
102    5      y
dawg

You are misinterpreting what you are seeing.

     A  B
id       
101  3  x
102  5  y

That output is not showing you a hierarchical column index. 'id' is the name of the row index. To show you the name of the index, pandas prints it on an extra line below the column headers.
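
One way to check this for yourself is to inspect the column and row indexes directly, and to use `reset_index()` if you want 'id' back as an ordinary column. This is a minimal sketch; the names `n_column_levels`, `is_multi`, `index_name` and `flat` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([(101, 3, 'x'), (102, 5, 'y')], columns=['id', 'A', 'B'])
df = df.set_index('id')

# The columns are a single flat level; there is no MultiIndex here.
n_column_levels = df.columns.nlevels
is_multi = isinstance(df.columns, pd.MultiIndex)

# 'id' lives on the row index, as its name.
index_name = df.index.name

# reset_index() moves the index back into an ordinary column.
flat = df.reset_index()
print(flat.to_string(index=False))
```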

The answer to your question depends on what you really want or need.

With the df as it is, you can dump it to CSV just the way you want:

print(df.to_csv(sep='\t'))

id  A   B
101 3   x
102 5   y

print(df.to_csv())

id,A,B
101,3,x
102,5,y

Or you can alter the df so that it displays the way you'd like:

print(df.rename_axis(None)) 

     A  B
101  3  x
102  5  y

Please do not do this! I am including it only to demonstrate how the axis names can be manipulated.

I could also keep the index as it is but manipulate both the column and row index names so that it prints the way you would like.

print(df.rename_axis(None).rename_axis('id', axis=1))

id   A  B
101  3  x
102  5  y

But this names the columns' index 'id', which makes no sense.

piRSquared
  • Hi could you help answer [this](https://stackoverflow.com/questions/64395699/how-to-drop-row-index-and-flatten-index-in-this-way) – Scope Oct 22 '20 at 11:14