Actually, you need to do pivot
, not transpose
.
Try: df.pivot(index='Hierarchy', columns='Fiscal_Mo', values='Amount').fillna('')
.
The final fillna('')
is required to replace NaN
values for non-existing
combinations of Fiscal_Mo
and Hierarchy
, e.g. 201802
/ A
.
Full code:
import pandas as pd
df = pd.DataFrame( {'Fiscal_Mo': ['201801', '201802', '201803', '201801'],
'Hierarchy': ['A', 'B', 'C', 'B'], 'Amount': [ 100, 20, 300, 400 ] } )
df.pivot(index='Hierarchy', columns='Fiscal_Mo',values='Amount').fillna('')
A remark concerning your comment: It contains hierarchy
written with
lower case h
. Maybe this is the source of your problem?
Edit
Maybe Hierarchy
is the index of your DataFrame.
In this case, the index of pivot
table is to bo created just from the
index of the source DataFrame, so index
parameter should be omitted.
The corresponding code to create the DataFrame is:
import pandas as pd
ind = pd.Index(data=['A', 'B', 'C', 'B'], name='Hierarchy')
df = pd.DataFrame( data={'Fiscal_Mo': ['201801', '201802', '201803', '201801'],
'Amount': [ 100, 20, 300, 400 ] }, index=ind)
and the command to create the pivot
table is:
df.pivot(columns='Fiscal_Mo', values='Amount').fillna('')
Edit 2
Investigation concerning header=None
The first attempt: I executed:
df = pd.read_excel('Report.xlsx', sheet_name='Report', header=None)
and printed the df
, the result was:
0 1 2
0 Fiscal_Mo Hierarchy Amount
1 201801 A 100
2 201802 B 20
3 201803 C 300
4 201801 B 400
As you can see:
- Column names in the DataFrame are consecutive numbers (wrong).
- Column names from the source file (Fiscal_Mo, Hierarchy, Amount)
are values of the first row (wrong).
Here, any attempt to call any Pandas function and passing "textual"
column names will fail.
The second attempt: I executed:
df = pd.read_excel('Report.xlsx', sheet_name='Report')
Note that this time there is no header
parameter, so the function
assumes header=0
- read column names from row 0 (initial).
When I printed the df
, the result was:
Fiscal_Mo Hierarchy Amount
0 201801 A 100
1 201802 B 20
2 201803 C 300
3 201801 B 400
As you can see:
- Column names in the DataFrame are taken from row number 0 (OK).
- Data rows are taken from "further" rows (OK).
Now you can call e.g.:
df.pivot(index='Hierarchy', columns='Fiscal_Mo', values='Amount').fillna('')
without any errors.
Note that:
- Parameter values must exactly match respective column names
(lower / upper case letters).
- I called
df.pivot(...)
not pd.pivot(df, ...)
. Apparently you failed
to read carefully my instructions and copy all parameters to your code.
Conclusion: Delete header=None
from read_excel
call and your
program should work.