Combine 2 rows into 1 with pandas

Question

I have a dataframe like this:

title	email_sent	click_rate
abc	10	0.1
abc	2	0.2
bcd	3	0.4

I want to combine 2 records with the same title abc, email sent will be sum of 10 and 2, click_rate will be (0.1 * 10+0.2 * 2)/12

the desired df looks like:

title	email_sent	click_rate
abc	12	0.12
bcd	3	0.4

score 1 · Accepted Answer · answered Aug 23 '22 at 16:09

You can use .apply() here.

Just define a function that takes in a sub-dataframe (for each unique title that you have), and returns a Series with the calculated values. This Series will become the row. So the index values of this Series will become the column names of the new data frame:

def f(df):
    n_emails = df['email_sent'].sum()
    click_rate = (df['click_rate'] * df['email_sent']).sum() / n_emails
    return pd.Series([n_emails, click_rate], 
                     index=['email_sent', 'click_rate'])

new_df = df.groupby('title').apply(f).reset_index()

Example:

df = pd.DataFrame([['abc', 10, 0.1], 
                   ['abc', 2,  0.2], 
                   ['bcd', 3,  0.4]], 
                  columns=['title', 'email_sent', 'click_rate'])

Result:

  title  email_sent  click_rate
0   abc        12.0    0.116667
1   bcd         3.0    0.400000

Combine 2 rows into 1 with pandas

1 Answers1