Reformatting a dataframe without using for loops

Question

I want to convert a dataframe like:

id  event_type count

1      "a"       3
1      "b"       5
2      "a"       1
3      "b"       2

into a dataframe like:

id    a    b   a > b
1     3    5     0
2     1    0     1
3     0    2     0

Without using for-loops. What's a proper pythonic (Pandas-tonic?) way of doing this?

Is the number of same IDs guaranteed to be 2? Or can tehre also be a row `1 "c" 6`? — Stefan Falk, Feb 03 '18 at 20:15
@displayname event_type can only be two ("a" or "b"), the IDs can be n. There's not a row like ´´´1 "c" 6´´´ in the dataframe. — andandandand, Feb 03 '18 at 20:19

Stefan Falk · Accepted Answer · 2018-02-03T20:31:29.440

Well, not sure if this is exactly what you need or if it has to be more flexible than this. However, this would be one way to do it - assuming missing values can be replaced by 0.

import pandas as pd
from io import StringIO

# Creating and reading the data

data = """
id  event_type count
1      "a"       3
1      "b"       5
2      "a"       1
3      "b"       2
"""
df = pd.read_csv(StringIO(data), sep='\s+')

# Transforming

df_ = pd.pivot_table(df, index='id', values='count', columns='event_type') \
        .fillna(0).astype(int)
df_['a > b'] = (df_['a'] > df_['b']).astype(int)

Where df_ will take the form:

event_type  a  b  a > b
id                     
1           3  5      0
2           1  0      1
3           0  2      0

score 1 · Answer 2 · answered Feb 03 '18 at 20:49

1

This can be split up into two parts.

pivot see post
assign new column

Solution

df.set_index(
    [‘id’, ‘event_type’]
)[‘count’].unstack(
    fill_value=0
).assign(**{
    ‘a < b’: lambda d: d.eval(‘a < b’)
})

answered Feb 03 '18 at 20:49

piRSquared

285,575
57
475
624

Wow, I like that! – Stefan Falk Feb 03 '18 at 20:51

Reformatting a dataframe without using for loops

2 Answers2