I have a wide-format data frame whose column names are either date ranges or empty (unnamed) strings, while the first row holds the intended column headers. I need code that deduces the week number from each date-range header, takes the column name from the first row, and renames the columns accordingly (e.g. week1_quantity, week1_sales, week1_profit).
import pandas as pd
df = pd.DataFrame([
{'Related Fields':'Description', 'Unnamed 1':'barcode',
'Unnamed 2':'department', 'Unnamed 3':'section',
'Unnamed 4':'reference', 'Sales: (06/07/2020,12/07/2020)':'Quantity',
'Unnamed 6':'amount', 'Unnamed 7':'cost',
'Unnamed 8':'% M/S', 'Unnamed 9': 'profit',
'Sales: (29/06/2020,05/07/2020)': 'Quantity',
'Unnamed 11':'amount', 'Unnamed 12':'cost',
'Unnamed 13':'% M/S', 'Unnamed 14':'profit'},
{'Related Fields':'cornflakes', 'Unnamed 1':'0001198',
'Unnamed 2':'grocery', 'Unnamed 3':'breakefast',
'Unnamed 4': '0001198', 'Sales: (06/07/2020,12/07/2020)': 60,
'Unnamed 6': 6000, 'Unnamed 7':3000, 'Unnamed 8':50,
'Unnamed 9':3000, 'Sales: (29/06/2020,05/07/2020)': 120,
'Unnamed 11':12000, 'Unnamed 12':6000, 'Unnamed 13':50,
'Unnamed 14':6000}
])
Expected result:
df2 = pd.DataFrame([
{'Description':'cornflakes', 'barcode':'0001198',
'department':'grocery', 'section':'breakefast',
'reference':'0001198', 'week28_quantity':60,
'week28_amount':6000, 'week28_cost':3000,
'week28_% M/S':50, 'week28_profit':3000,
'week27_quantity':120, 'week27_amount':12000,
'week27_cost':6000, 'week27_% M/S':50,
'week27_profit':6000}
])
I've tried renaming the columns manually, but I would like an automated solution.
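
For illustration, here is a minimal sketch of one way the renaming could be automated. It rests on several assumptions: every week block starts with a column named like `Sales: (dd/mm/yyyy,dd/mm/yyyy)`, the week is the ISO week of the range's start date (so 06/07/2020 falls in week 28 and 29/06/2020 in week 27), and every column after a date-range header belongs to that week until the next date-range header. The function name `flatten_week_headers` and the trimmed `sample` frame are hypothetical, made up for this example.

```python
import re
from datetime import datetime

import pandas as pd


def flatten_week_headers(df):
    """Rename columns from the first row, prefixing week blocks with 'weekN_'.

    Hypothetical helper: assumes date-range headers look like
    'Sales: (dd/mm/yyyy,dd/mm/yyyy)' and that N is the ISO week
    of the start date.
    """
    start_date = re.compile(r"\((\d{2}/\d{2}/\d{4}),")
    prefix = ""  # stays empty until the first date-range column is seen
    new_names = []
    for col, label in zip(df.columns, df.iloc[0]):
        label = str(label)
        match = start_date.search(str(col))
        if match:
            start = datetime.strptime(match.group(1), "%d/%m/%Y")
            prefix = f"week{start.isocalendar()[1]}_"
        if prefix:
            # Lower-case only the first character: 'Quantity' -> 'quantity',
            # while '% M/S' is left unchanged.
            label = label[:1].lower() + label[1:]
        new_names.append(prefix + label)
    # Drop the header row now that its values have become the column names.
    out = df.iloc[1:].reset_index(drop=True)
    out.columns = new_names
    return out


# Trimmed-down, hypothetical version of the frame in the question.
sample = pd.DataFrame([
    {'Related Fields': 'Description', 'Unnamed 1': 'barcode',
     'Sales: (06/07/2020,12/07/2020)': 'Quantity', 'Unnamed 3': '% M/S',
     'Sales: (29/06/2020,05/07/2020)': 'Quantity', 'Unnamed 5': 'profit'},
    {'Related Fields': 'cornflakes', 'Unnamed 1': '0001198',
     'Sales: (06/07/2020,12/07/2020)': 60, 'Unnamed 3': 50,
     'Sales: (29/06/2020,05/07/2020)': 120, 'Unnamed 5': 6000},
])
result = flatten_week_headers(sample)
print(list(result.columns))
```

Calling `flatten_week_headers(df)` on the full frame should work the same way. The data columns keep the `object` dtype because they originally shared a column with the header strings, so a follow-up such as `result.infer_objects()` may be needed before doing arithmetic on them.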