1

I have a multiindex dataframe. Columns 'partner', 'employer', and 'date' are the multiindex. enter image description here

partner employer date ecom sales
A a 10/01/21 1 0
A a 10/02/21 1 0
A a 10/03/21 0 1
A b 10/01/21 0 1
A b 10/02/21 1 0
A b 10/03/21 1 0
B c 10/03/21 1 0
B c 10/04/21 1 0
B c 10/04/21 0 1

I'm trying to find which unique (parter, employer) pairs have 'ecom' BEFORE 'sales'. For example, I want to have the output to be. How do I filter through each (partner, employer) pair with these conditions in python? enter image description here

partner employer date ecom sales
A a 10/01/21 1 0
A a 10/02/21 1 0
A a 10/03/21 0 1
B c 10/03/21 1 0
B c 10/04/21 1 0
B c 10/04/21 0 1
Psidom
  • 209,562
  • 33
  • 339
  • 356
gutedaama
  • 11
  • 2
  • Have a look here https://stackoverflow.com/a/20159305/463796 on how to make your question a good reproducible example, in code. – w-m Oct 07 '21 at 10:37

1 Answers1

0

Try this:

# Find the first date where ecom or sales is not 0
first_date = lambda col: col.first_valid_index()[-1]
tmp = df.replace(0, np.nan).sort_index().groupby(level=[0,1]).agg(
    first_ecom=('ecom', first_date),
    first_sales=('sales', first_date)
)

# The (partner, employer) pairs where ecom happens before sales
idx = tmp[tmp['first_ecom'] < tmp['first_sales']].index

# Condition to filter the original frame
cond = df.index.droplevel(-1).isin(idx)

# Result
df[cond]
Code Different
  • 90,614
  • 16
  • 144
  • 163