
I am looking to pivot the feature column into column headers.

Here is the df:

[image: screenshot of the df]

For instance, neither of the following works. I've tried multiple variations to no avail.

df.groupby(['feature', 'year'])['value'].unstack(fill_value=0)

df.pivot_table(index='year', columns='feature', values='value')

The end goal would look like this:

[image: desired output]

Joe Rivera
  • Kindly post data, not images. Use this as a guide on how to post a minimal reproducible example: [link](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – sammywemmy Mar 29 '20 at 02:03
  • Can you provide your df as code rather than as an Excel screengrab? – Umar.H Mar 29 '20 at 02:04
  • Use `pivot` instead of `pivot_table`. `pivot` is a simple reshaping of the data, which is what you want here. `pivot_table` is more when you want to aggregate (or reshape with MultiIndices). You'd need to use `aggfunc='first'` with `pivot_table` to deal with strings as the default assumption is you aggregate with `mean` with `pivot_table`. – ALollz Mar 29 '20 at 02:18

2 Answers


You are going from a long layout to a wide layout. See the sample below.

import pandas as pd

df_long = pd.DataFrame({
        "student":
            ["Andy", "Bernie", "Cindy", "Deb",
             "Andy", "Bernie", "Cindy", "Deb",
             "Andy", "Bernie", "Cindy", "Deb"],
        "school":
            ["Z", "Y", "Z", "Y",
             "Z", "Y", "Z", "Y",
             "Z", "Y", "Z", "Y"],
        "class":
            ["english", "english", "english", "english",
             "math", "math", "math", "math",
             "physics", "physics", "physics", "physics"],
        "grade":
            [10, 100, 1000, 10000,
             20, 200, 2000, 20000,
             30, 300, 3000, 30000]
})
df_long

[image: output of df_long]

df_long.pivot_table(index=["student", "school"], 
                    columns='class', 
                    values='grade').reset_index()
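
For comparison, the same long-to-wide reshape can be sketched with `set_index` plus `unstack`, which is roughly what the `groupby(...).unstack(...)` attempt in the question was reaching for. This is only an illustration: `df_small` is a hypothetical, trimmed copy of the sample above, and it assumes each (student, school, class) combination appears once.

```python
import pandas as pd

# Hypothetical trimmed copy of the sample data (two students, two classes).
df_small = pd.DataFrame({
    "student": ["Andy", "Bernie", "Andy", "Bernie"],
    "school": ["Z", "Y", "Z", "Y"],
    "class": ["english", "english", "math", "math"],
    "grade": [10, 100, 20, 200],
})

# Move the identifying columns into the index, then unstack "class"
# so that each class becomes its own column.
wide = (
    df_small.set_index(["student", "school", "class"])["grade"]
    .unstack("class", fill_value=0)
    .reset_index()
)
print(wide)
```

Unlike `pivot_table`, this does no aggregation, so it raises an error if the index contains duplicate entries.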

[image: pivoted output]

ASH

@ALollz tipped me off on this one. I should have been using `pivot` and not `pivot_table`.

df = pd.pivot(df, values='value', columns='feature', index='year')
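
A minimal sketch of why this matters, following @ALollz's comment. The original df was only posted as an image, so the frame below is invented for illustration: it assumes a long-format frame with `year`, `feature`, and string-valued `value` columns.

```python
import pandas as pd

# Hypothetical data; the question's real df was posted only as an image.
df = pd.DataFrame({
    "year": [2018, 2018, 2019, 2019],
    "feature": ["color", "size", "color", "size"],
    "value": ["red", "small", "blue", "large"],
})

# pivot only reshapes: each (year, feature) pair must be unique.
wide = df.pivot(index="year", columns="feature", values="value")

# pivot_table aggregates. Its default aggfunc ("mean") is not meaningful
# for strings, so an explicit aggfunc such as "first" is passed instead.
wide_pt = df.pivot_table(
    index="year", columns="feature", values="value", aggfunc="first"
)

print(wide)
```

With unique (year, feature) pairs, both calls should give the same wide table; `pivot` is the simpler choice when no aggregation is intended.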
Joe Rivera