I'm making a pivot table from a CSV (cl_total_data.csv) file using pandas pd.pivot_table() and need find a fix to values in the wrong rows.
[Original CSV File]
The error occurs when the year has 53 weeks(i.e. 53 values) instead of 52, the first value in the year with 53 weeks is set as the last value in the pivot table
[Pivot Table with wrong values top]
[Pivot Table with wrong values bottom]
[Original CSV 2021 w/ 53 values]
The last value for the pivot table 2021 row 53 (1123544) is the first value of the year for 2021-01-01 (1123544) in the original CSV table for 2021.
I figured out how to fix this in the pivot table after making it. I use
Find columns with 53 values:
cl_total_p.columns[~cl_total_p.isnull().any()]
Then take the values from the original CSV files to its corresponding year and replace the values in the pivot table
cl_total_p[2021] = cl_total_data.loc['2021'].Quantity.values
My problem is:
- I can't figure out what I'm coding wrong in the pivot table function that causes this misplacement of values. Is there a better way to code it?
- Using my manual solution takes a lot of time especially when I'm using multiple CSV files 10+ and having to fix every single misplacement in columns with 53 weeks. Is there a for loop I can code to loop through all columns with 53 weeks and replace them with their corresponding year?
I tried
import numpy
import pandas
year_range = np.arange(1982,2023)
week_range = np.arange(54)
for i in year_range:
for y in week_range:
cl_total_p[i] = cl_total_data.loc['y'].Quantity.values
But I get an error :( How can I fix the pivot table value misplacement? and/or find a for loop to take the original values and replace them in the pivot table?