Why does df.groupby().apply() calculate the first group twice?

Asked Oct 24 '18 at 05:25

Active Oct 24 '18 at 05:25

Viewed 45 times

When I use groupby().apply() to calculate something per group, such as a weighted average, I find that the first group is always calculated twice. For example:

import pandas as pd

result = []  # filled as a side effect of test()

def test(dataframe):
    df = dataframe.copy()
    a = df['a'].iloc[0]   # the group key
    b = df['b'].mean()    # mean of b within the group
    result.append([a, b])

df = pd.DataFrame({'a':[1,1,1,2,2,2,2,3,3,3],'b':[1,2,3,4,5,6,7,8,9,10]})
df.groupby('a').apply(test)
result = pd.DataFrame(result, columns=['a', 'b'])

then I get:

As you can see, the first group is calculated twice. I don't know why.
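A likely explanation, as far as I know: in pandas releases before 0.25, apply called the function on the first group an extra time to decide between a fast and a slow code path, so any side effect (such as appending to a global list) happened twice for that group. The sketch below is only one way to sidestep this, assuming the goal is one row per group: it iterates over the groups explicitly, which visits each group once. The names `rows` and `summary` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
                   'b': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})

# Iterating over a GroupBy object yields (key, sub-frame) pairs,
# one pair per group, so the loop body runs once for each group.
rows = []
for key, group in df.groupby('a'):
    rows.append([key, group['b'].mean()])

# One row per group: the key in column a, the group mean in column b.
summary = pd.DataFrame(rows, columns=['a', 'b'])
print(summary)
```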

asked Oct 24 '18 at 05:25

xxyao


I suggest using `def test(dataframe): a = dataframe['a'].iloc[0]; b = dataframe['b'].mean(); return pd.Series([a,b])` and then `result = df.groupby('a', as_index=False).apply(test)` – jezrael Oct 24 '18 at 05:30

@jezrael Yes, your code solves my problem perfectly. Thank you! – xxyao Oct 24 '18 at 05:44
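For readers who want the comment's suggestion as a runnable script, here is a minimal sketch of the same idea: return the values from the function instead of appending to a global list, so an extra call on the first group cannot add a duplicate row. The function name `summarize`, the explicit `[['a', 'b']]` column selection, and the `reset_index(drop=True)` call are additions of this sketch, not part of the original comment. The selection is there on the assumption that newer pandas versions may not pass the grouping column to the function otherwise. The last line shows a built-in aggregation that gives the same means without apply.

```python
import pandas as pd

def summarize(group):
    # Return the per-group values; pandas assembles them into one row
    # per group, however many times it calls this function internally.
    return pd.Series({'a': group['a'].iloc[0], 'b': group['b'].mean()})

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
                   'b': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})

# Assumption of this sketch: select the columns explicitly so that the
# grouping column 'a' is available inside summarize().
applied = df.groupby('a')[['a', 'b']].apply(summarize).reset_index(drop=True)

# Alternative without apply: a plain per-group mean of column b.
builtin = df.groupby('a', as_index=False)['b'].mean()

print(applied)
print(builtin)
```

Both results have one row per value of a. Because the function mixes an integer and a float in one Series, the a column of `applied` comes back as float.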


0 Answers