0

I am using Pivot_table to summarize some information for my data. However I am keep getting the Keyerror for column which is not used in pivot_table call and I can't seem to figure out what's wrong. I have tried numerous solution proposed by other posts but none of them are working.

So, I really need some expert help.

Here is my data frame:

H1   H2       H3         H4     H5  H6     H7    H8    H10
RA1  RB1, H2 2020-07-25  11:30  60  1764   RG1   RH1   RJ1
RA2  RB2, H2 2020-07-25  11:30  60  1774   RG2   RH2   RJ2
RA3  RB3, H2 2020-07-25  11:30  60  1791   RG3   RH3   RJ3
RA4  RB4, H2 2020-07-25  11:30  35  1806   RG4   RH4   RJ4
RA5  RB1, H3 2020-07-25  12:30  35  1771   RG5   RH5   RJ5
RA6  RB2, H3 2020-07-25  12:45  60  1813   RG6   RH6   RJ6
RA7  RB3, H3 2020-07-25  13:00  60  1789   RG7   RH7   RJ7
RA8  RB4, H3 2020-07-25  13:00  60  1790   RG8   RH8   RJ8
RA9  RB1, H4 2020-07-25  13:00  60  1808   RG9   RH9   RJ9
RA10  RB2, H4 2020-07-25  14:00  60  1822  RG10  RH10  RJ10

Here is my code:

import pandas as pd
import numpy as np
import os, shutil, inspect, xlsxwriter

from os import listdir
from os.path import isfile, join

class GetMyData():
    def __init__(self, directory):
        self.datafl = os.path.join(directory, 'data.xlsx')
        self.pivotfl = os.path.join(directory, 'pivot.xlsx')

    def getDataFrame(self):
        try:
            df = pd.DataFrame()
            read_order = ('H1', 'H2', 'H3', 'H4', 'H5','H6', 'H7',\ 
                          'H8','H10')

            # Read Dataframe.
            df = pd.read_excel(self.datafl, usecols=read_order, index_col=None)
            return (df)        
        except (Exception, FileNotFoundError, IOError) as e:
            print("Function:", inspect.stack()[0][3], "Raised: ", e)

    # Create Pivot Table
    def createPivot(self):
        try: 
            df = pd.DataFrame()

            df = self.getDataFrame()

            df['H3'] = pd.to_datetime(df['H3'], utc=False) - pd.to_timedelta(7, unit='d')
            
            print (df)
            # Summarize the data in Pivot table.
            pivot = pd.pivot_table(df, index=['H1', pd.Grouper(key='H3', freq='W-MON')], columns='H10',\
                margins=True, aggfunc={'H10':np.count_nonzero}).reset_index()
            
            # Write Dataframe to Excel file.
            df.to_excel(self.pivotfl, sheet_name='Result', startrow=0)
        except (Exception, FileNotFoundError, IOError) as e:
            print("Function:", inspect.stack()[0][3], "Raised: ", e)

if __name__ == "__main__":
    data = GetMyData(os.getcwd())
    data.createPivot()

When I am running this code I am getting following Exception:

Function: createPivot Raised:  'H2'

As I said I have tried many things like resetting index and other solutions but none of them seems to work.

Please let me know if there is anything else I can try or provide your expert opinion to show meaningful error.

Really appreciate your help.

Thank You.

DjangoRulz
  • 85
  • 7
  • Can you provide some sample input data and expected output (no images)? Please see https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples Thank you! – David Erickson Oct 07 '20 at 23:31
  • Be advised that images of data/codes are discouraged on SO. – Quang Hoang Oct 08 '20 at 00:05
  • Sorry about pasting the image. I reduced my dataset and added into original post. Please let me know if you can help me further. Really appreciate your help. – DjangoRulz Oct 08 '20 at 00:31

0 Answers0