-2

I have two dataframes df_inv df_inv

and df_sales. df_sales

I need to add a column to df_inv with the sales person name based on the doctor he is tagged in df_sales. This would be a simple merge I guess if the sales person to doctor relationship in df_sales was unique. But There is change in ownership of doctors among sales person and a row is added with each transfer with an updated day. So if the invoice date is less than updated date then previous tagging should be used, If there are no tagging previously then it should show nan. In other word for each invoice_date in df_inv the previous maximum updated_date in df_sales should be used for tagging.

The resulting table should be like this Final Table

I am relatively new to programming but I can usually find my way through problems. But I can not figure this out. Any help is appreciated

  • 1
    Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on how to ask a good question may also be useful. – yatu Jun 01 '19 at 17:11

1 Answers1

0
import pandas as pd
import numpy as np
df_inv = pd.read_excel(r'C:\Users\joy\Desktop\sales indexing\consolidated report.xlsx')
df_sales1 = pd.read_excel(r'C:\Users\joy\Desktop\sales indexing\Sales Person 
tagging.xlsx')
df_sales2 = df_sales1.sort_values('Updated Date',ascending=False)

df_sales = df_sales2.reset_index(drop=True)

sales_tag = []
sales_dup = []
counter = 0
for inv_dt, doc in zip(df_inv['Invoice_date'],df_inv['Doctor_Name']):

    for sal, ref, update in zip(df_sales['Sales 
    Person'],df_sales['RefDoctor'],df_sales['Updated Date']):

    if ref==doc:
        if update<=inv_dt and sal not in sales_dup :
            sales_tag.append(sal)
            sales_dup.append(ref)
            break
        else:
            pass

    else:
        pass
sales_dup = []
counter = counter+1
if len(sales_tag)<counter:
    sales_tag.append('none')
else:
    pass

df_inv['sales_person'] = sales_tag

This appears to work.