[SOLVED by Changing the date format for pandas to parse]
I am trying to calculate Month Name and I checked there is a massive difference when I am using dt.month_name and Excel's month to get the month of transaction.
Context: I am running an analysis on my own bank transaction. For the simplicity, I have two dates - Value date and Transaction date. I have converted both to datetime format and used dt.month to get the month names. After facing some problems and doing a RCA, I came to the conclusion that the value_counts of months calculated in excel is vastly different from the ones calculated in Python.
This is after using Transaction date to calculate the month value (py_Month)
This is after using Value Date to calculate the month (py_month)
The Entire code is as follows:
data = pd.read_excel("data.xlsx",sheet_name="Sheet1")
data = data.drop(["S No."],axis=1)
# Convert to datetime.
date_cols = ["Value Date","Transaction Date"]
data[date_cols] = data[date_cols].apply(pd.to_datetime)
val_date = "Value Date"
tnx_date = "Transaction Date"
data["py_Month"] = data[val_date].dt.month_name()
# data["py_Month"] = data[tnx_date].dt.month_name()
vc_df = data[['xl_month', 'py_Month']].apply(pd.Series.value_counts)
vc_df