
I am new to pandas and am trying to figure out how to transform a dataframe by grouping rows. The column names need to change depending on the data. Any help would be greatly appreciated. I added an image to show the original vs. transformed view.

Here is the original data and the expected result:

[image: original vs. transformed view]

I tried the following code. The value in b_2021 should be 48.3, but with my code it still shows 32.2. Also, the values in column a should not be repeated, so in this example I should end up with only 3 rows.

import pandas as pd

df = pd.DataFrame({
    'a':[111, 111, 222, 222, 333, 333],
    'b':[2020, 2021, 2020, 2021, 2020, 2021],
    'c':[33.2, 48.3, 32.2, 45.5, 45.5, 78.4]
})
# Attempt: add one 'b_<year>' column per row, each filled from column 'c'
df = df.assign(**{'b_'+ str(df['b'].iloc[i]): lambda x: x['c'] for i in range(len(df))})
df.drop('b', axis=1, inplace=True)
df.drop('c', axis=1, inplace=True)
print(df)
Praveena
  • You can do this by using pivot. Have a look: https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe and the docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pivot.html – Chrysophylaxs Jan 06 '23 at 23:29
  • Spoiler alert, try: `df.pivot(index="a", columns="b", values="c")` – Chrysophylaxs Jan 06 '23 at 23:29
  • Thank you, Chrysophylaxs. How can I give the columns dynamic names? If the year is 2020, the column name needs to be b_2020; if it is 2021, the column name needs to be b_2021, and so on. – Praveena Jan 07 '23 at 17:46
  • You can chain the rename method: `df.pivot(index="a", columns="b", values="c").rename(columns=lambda x: f"b_{x}")` – Chrysophylaxs Jan 07 '23 at 17:57
  • In the rename method, you should pass a mapping or a function to `columns=` that does what you want it to do – Chrysophylaxs Jan 07 '23 at 18:50
  • You can make a dictionary that maps from current_column_name to new_column_name, or make a function that takes the old column name and returns the new column name – Chrysophylaxs Jan 07 '23 at 18:50
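
Putting the comments above together, here is a minimal sketch of the pivot-and-rename approach, using the sample data from the question. The variable names (`wide`, `by_function`, `mapping`, `by_mapping`, `result`) are only illustrative, and the final `rename_axis`/`reset_index` step is an assumption about the desired layout (column `a` as a regular column, no `b` label on the columns axis); it is not something the comments spell out.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [111, 111, 222, 222, 333, 333],
    "b": [2020, 2021, 2020, 2021, 2020, 2021],
    "c": [33.2, 48.3, 32.2, 45.5, 45.5, 78.4],
})

# Reshape long -> wide: one row per value of "a", one column per distinct year in "b".
wide = df.pivot(index="a", columns="b", values="c")

# Option 1: rename with a function applied to each column label.
by_function = wide.rename(columns=lambda year: f"b_{year}")

# Option 2: rename with an explicit old-name -> new-name mapping.
mapping = {year: f"b_{year}" for year in wide.columns}
by_mapping = wide.rename(columns=mapping)

# Assumed final layout: drop the "b" label on the columns axis
# and turn the index "a" back into a regular column.
result = by_function.rename_axis(None, axis=1).reset_index()
print(result)
```

Both options should give the same column names; the function form is shorter, while the dictionary form may be easier to adapt if only some columns need renaming. Note that `pivot` raises an error if any (`a`, `b`) pair appears more than once, in which case `pivot_table` with an aggregation function is the usual alternative.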
