i have created a following dataset
dataset1 = { 'srid':[1,2,3,1,5],
'custid':[11,12,43,12,34],
'orderdate':["1/2/2019","1/2/2019","2/2/2019","1/2/2019","1/2/2019"],
'Rev':[100,101,102,103,17]
}
df1 = pd.DataFrame(dataset1)
I have to mark every sales representative as : Excellent(where his total revenue for a day is 1.5 times or greater than average revenue of all sales representative for that day) Good (where his total revenue for a day is less than 1.5 times and greater than equal to 1.1 times average revenue of all sales representative for that day) Average (where his total revenue for a day is less than 1.1 times and greater than equal to 0.9 times the average revenue of all sales representative for that day) Poor (where his total revenue for a day is less than 0.9 times the average revenue of all sales representative for that day) for every date present in ‘Dataset 1’ Output Dataset: Sales Representative ID,Order Date, Marking
what i tried is:
g=df.groupby(df['orderdate'])
ans={}
for od,od_df in g:
# print(od)
ans[od]=list()
x=od_df["Rev"].mean()
s=set(od_df["srid"].tolist())
for i in s:
p=od_df[od_df["srid"]==i]["Rev"].sum()
val = p/x
if val>=1.5:
ans[od].append([i,od,"Excellent"])
elif 1.1<=val<1.5:
ans[od].append([i,od,"good"])
elif 0.9<=val<1.1:
ans[od].append([i,od,"avg"])
else:
ans[od].append([i,od,"poor"])
But that is alot to write and will take more time on big dataset how can i optimise it further