I have the below data frame and am looking to pivot to the below desired format. I've been playing with the parameters and haven't been able to achieve my expected results due to duplicate values and the below error being thrown:
"ValueError: Index contains duplicate entries, cannot reshape"
import pandas as pd
dummy_data = {
'Field_Names': ['Field_1','Field_2','Field_3','Field_1','Field_2','Field_3'],
'Field_Values': ['Bob','STL','1234','Eric','DSM','5678'],
'Record_ID': ['abc123','abc123','abc123','abc123','abc123','abc123']
}
dummy_df = pd.DataFrame(dummy_data)
display(dummy_df)
Desired Output:
details = {'Field_1': ['Bob', 'Eric'],
'Field_2': ['STL', 'DSM'],
'Field_3': ['1234', '5678'],
'Field_ID': ['abc123','abc123']}
desired_output = pd.DataFrame(details)
display(desired_output)
I have tried variations of the below as well as .pivot_table() after reading SO threads with no luck and am unsure what to try next as there are a myriad of solutions out there and I'm not sure which is best for my use case.
final_df = dummy_df.groupby('Field_Names')['Field_Values','Record_Id'].apply(list)
final_df = pd.DataFrame({i:final_df.loc[i] for i in final_df.index.to_list()})
The commented solution also does not work and throws our duplicate error:
dummy_df.pivot(index='Record_ID', columns='Field_Names', values='Field_Values').reset_index()