
I want to aggregate rows, using different conditions for two columns.

When I do df.groupby('[a]').agg('count'), I get the output 1

When I do df.groupby('[a]').agg('mean'), I get the output 2

Is there a way to do a single aggregation that applies output 1 (the count) to column [b] and output 2 (the mean) to column [c]?

Bill Armstrong
Oalvinegro

1 Answer


The code below should work:

# Import libraries
import pandas as pd

# Create sample dataframe
df = pd.DataFrame({'a': ['A1', 'A1', 'A2', 'A3', 'A4', 'A3'],
                   'value': [1,2,3,4,5,6]})


# Calculate the per-group count and mean of 'value'
temp1 = df.groupby(['a'])['value'].count().reset_index().rename(columns={'value':'count'})
temp2 = df.groupby(['a'])['value'].mean().reset_index().rename(columns={'value':'mean'})

# Add columns to existing dataframe
df.merge(temp1, on='a', how='inner').merge(temp2, on='a', how='inner')

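If the goal is only to attach the per-group count and mean to every original row, groupby with transform may be a shorter route than the two merges. This is a minimal sketch of that alternative, not the method used in this answer; it reuses the sample frame above, and the name out is just an illustrative copy so the original df is left unchanged.

```python
import pandas as pd

# Same sample dataframe as in the answer
df = pd.DataFrame({'a': ['A1', 'A1', 'A2', 'A3', 'A4', 'A3'],
                   'value': [1, 2, 3, 4, 5, 6]})

# Work on a copy so the original frame is not modified
out = df.copy()

# transform returns one value per original row, aligned on the index,
# so the result can be assigned directly as a new column
out['count'] = out.groupby('a')['value'].transform('count')
out['mean'] = out.groupby('a')['value'].transform('mean')

print(out)
```

Each row gets the count and mean of its own group, so no intermediate frames or join keys are needed.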

# Add columns to a new dataframe
df2 = temp1.merge(temp2, on='a', how='inner')
df2

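For the case in the question, where two different columns each need their own aggregation, pandas can also do this in a single agg call by passing a dict that maps column names to functions. The sketch below assumes columns literally named a, b and c, and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names a, b, c follow the question
df = pd.DataFrame({'a': ['A1', 'A1', 'A2', 'A3', 'A3'],
                   'b': [10, 20, 30, 40, 50],
                   'c': [1.0, 2.0, 3.0, 4.0, 6.0]})

# One groupby, a different aggregation per column:
# count the rows of b, average the values of c
result = df.groupby('a').agg({'b': 'count', 'c': 'mean'}).reset_index()
print(result)

# Named aggregation does the same thing and lets you pick
# the output column names (b_count and c_mean are arbitrary)
named = df.groupby('a').agg(b_count=('b', 'count'),
                            c_mean=('c', 'mean')).reset_index()
print(named)
```

The dict form keeps the original column names, while named aggregation is handy when the same column needs more than one statistic.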

Nilesh Ingle