I have a large spreadsheet which I have imported using pandas.
The first x set of columns can be considered necessary for every row. The next y set of columns are actually just 9 repeated column headers in 7 groups, but with different values for each row. I want to try and create a new dataframe, which separates out each of those repeated column headers and joins them with the necessary x columns. I have tried many ways, but I cannot get it to work.
As a simple example, lets say we have the following dataframe:
ID | Name | Category | Year | Choice | Price | Year | Choice | Price | Year | Choice | Price |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | Test | AM | 2022 | A | 10 | 2022 | B | 11 | 2023 | B | 11.5 |
This should be transformed into this format:
ID | Name | Category | Year | Choice | Price |
---|---|---|---|---|---|
1 | Test | AM | 2022 | A | 10 |
1 | Test | AM | 2022 | B | 11 |
1 | Test | AM | 2023 | B | 11.5 |
I feel like this should be a whole lot easier than it has been. Please help out.
NOTE: There are actually 18 necessary headers, followed by 7 groups of 9 headers. The 7 groups all have the same header name (which is possibly why my efforts to use .drop and .iloc
have not worked properly. I am a python beginner.
Latest attempt was to create a new copy of the df for each of the blocks. This produces an error
ValueError: operands could not be broadcast together with shapes (9,) (44,)
new_dfA = dfA.drop(dfA.columns[27:], axis=1).reset_index(drop=True)
new_dfB = dfB.drop(dfB.columns[18:27] + list(df.columns[37:]), axis=1).reset_index(drop=True)
new_dfC = dfC.drop(dfC.columns[18:36] + list(df.columns[46:]), axis=1).reset_index(drop=True)
new_dfD = dfD.drop(dfD.columns[18:45] + list(df.columns[55:]), axis=1).reset_index(drop=True)
new_dfE = dfE.drop(dfE.columns[18:54] + list(df.columns[64:]), axis=1).reset_index(drop=True)
new_dfF = dfF.drop(dfF.columns[18:63] + list(df.columns[72:]), axis=1).reset_index(drop=True)
new_dfG = dfG.drop(dfG.columns[18:72] + list(df.columns[80:]), axis=1).reset_index(drop=True)