0

I import data in a pandas dataframe from as SQL database. Every row includes a id, a parameter name, and the corresponding parameter value. Just like in this stripped down example:

import pandas as pd
data =  [
            ['aaa', 'A', 0],
            ['bbb', 'A', 1],          
            ['aaa', 'B', 2],
            ['bbb', 'B', 3],
            ['aaa', 'C', 4],
            ['bbb', 'C', 5],
]
df = pd.DataFrame(data,  columns=['id', 'name', 'value'])
df = df.set_index('id')
print(df)
    name  value
id             
aaa    A      0
bbb    A      1
aaa    B      2
bbb    B      3
aaa    C      4
bbb    C      5

To analyse the correlations between different parameters A, B, C, [...] I need a data frame in the format:

     A  B  C
id          
aaa  0  2  4
bbb  1  3  5

I guess this must be a relative common situation, but I can not figure out how to do this in a elegant way.

hbar
  • 43
  • 2

1 Answers1

1

Yes there is an easy solution with pivot_table:

output = df.pivot_table(index='id', columns='name')
print(output)

    value      
name     A  B  C
id              
aaa      0  2  4
bbb      1  3  5

Robert King
  • 974
  • 5
  • 16
  • 1
    Guessed that this must be simple. Thank you! For anybody how stumbles accros this question https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html gives a more detailed explanation (easy to find if you know the correct keywords). – hbar Dec 12 '19 at 14:05