Calculate count features by group in a pandas dataframe

Question

I have the following dataset:

import pandas as pd
from datetime import datetime
import numpy as np

date_rng = pd.date_range(start='2020-07-01', end='2020-07-10', freq='D')
l1 = [np.nan, np.nan, "local_max", np.nan, np.nan, "local_min", np.nan, np.nan, "local_max", np.nan]
l2 = [np.nan, np.nan, "local_max", np.nan, np.nan, "local_min", np.nan, np.nan, "local_max", "local_min"]

df = pd.DataFrame({
    'date':date_rng,
    'value':l1,
    'group':'a'
})
df2 = pd.DataFrame({
    'date':date_rng,
    'value':l1,
    'group':'b'
})

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, df2], ignore_index=True)

I want to calculate features, such as the count of local_min and local_max per group, and save them in a new dataframe. I am able to calculate the features, but I fail to apply them per group in an elegant way.

The desired output:

columns = ["group", "local_min", "local_max"]

df_features = pd.DataFrame([["a", 1, 2],
                            ["b", 1, 3]],
                           columns=columns)
df_features

Any help would be much appreciated!

score 1 · Accepted Answer · answered Oct 12 '20 at 10:32

df.groupby works:

df.groupby(['group','value']).count()

output:

                 date
group value          
a     local_max     2
      local_min     1
b     local_max     2
      local_min     1

Jim Eisenberg


score 0 · Answer 2 · answered Oct 12 '20 at 10:30

Try with pivot_table:

pd.pivot_table(df, index='group', columns='value', aggfunc='count')

           date
value local_max local_min
group
a             2         1
b             2         1
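
The result above has two column levels, with 'date' on top. A possible variation, sketched here under the assumption that a flat table with a 'group' column is wanted, is to pass values='date' so only one column level is produced, then reset the index. The names `values` and `df_features` are illustrative, and the sample data is rebuilt with pd.concat.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (both groups share the same labels).
date_rng = pd.date_range(start="2020-07-01", end="2020-07-10", freq="D")
values = [np.nan, np.nan, "local_max", np.nan, np.nan,
          "local_min", np.nan, np.nan, "local_max", np.nan]
df = pd.concat(
    [pd.DataFrame({"date": date_rng, "value": values, "group": g}) for g in ("a", "b")],
    ignore_index=True,
)

# Naming a single values column keeps the result columns one level deep.
df_features = pd.pivot_table(
    df,
    index="group",
    columns="value",
    values="date",
    aggfunc="count",
    fill_value=0,
).reset_index()
df_features.columns.name = None  # drop the leftover 'value' label on the column axis

print(df_features)
```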

IoaTzimas

