How to create an alert on duplicate usernames with Python pandas

Question

I was trying to write a script that alerts me when a username is duplicated and that user has a high level of depression. Here is a sample of the data,

and here is my code:

import pandas as pd

df = pd.read_csv("Path")

username_grp = df.groupby(['username'])
filt = df['username'] == 'ali'

print(username_grp.get_group("ali"))
print(username_grp['level'].value_counts()) 
print(username_grp['level'].value_counts().loc['ali'])
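
For reference, a small stand-in data frame makes the snippet above runnable without a CSV file. This is only a sketch: the values are hypothetical, and only the column names `username` and `level` come from the code above. It groups by the scalar `"username"` rather than a one-element list, since newer pandas versions may expect a tuple key in `get_group` when the grouper is a list.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "username": ["ali", "ali", "ali", "sara", "sara", "omar"],
    "level": ["high", "high", "low", "high", "low", "low"],
})

username_grp = df.groupby("username")

# All rows that belong to one user.
ali_rows = username_grp.get_group("ali")
print(ali_rows)

# Number of rows for each (username, level) pair.
level_counts = username_grp["level"].value_counts()
print(level_counts)

# Level counts for a single user.
ali_levels = level_counts.loc["ali"]
print(ali_levels)
```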

Please supply the expected [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) (MRE). We should be able to copy and paste a contiguous block of your code, execute that file, and reproduce your problem along with tracing output for the problem points. This lets us test our suggestions against your test data and desired output. Please [include a minimal data frame](https://stackoverflow.com/questions/52413246/how-to-provide-a-reproducible-copy-of-your-dataframe-with-to-clipboard) as part of your MRE. — Prune, Aug 12 '21 at 21:44

score 1 · Accepted Answer · answered Aug 12 '21 at 21:53

Use value_counts:

>>> df[df['level'] == 'high'].value_counts('username').gt(0).index.tolist()
['ali']

Corralien


Thank you for sharing, but what if I have multiple usernames that are duplicated in the same sheet? – Ali.M.Kamel Aug 12 '21 at 22:50

You will see the full list of duplicated usernames that have a high level of depression. – Corralien Aug 12 '21 at 22:54
It's working fine, except that .gt doesn't work, since I'm trying to retrieve only the names that are duplicated more than 2 times and have a high level of depression. – Ali.M.Kamel Aug 14 '21 at 09:49
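
On the `.gt` point in the last comment: `.gt(n)` returns a boolean Series whose index still contains every username, so taking `.index` from it directly does not filter anything. A minimal sketch of using that boolean Series as a mask instead, assuming hypothetical sample data and a threshold of more than 2 rows:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "username": ["ali", "ali", "ali", "sara", "sara", "omar", "omar"],
    "level": ["high", "high", "high", "high", "low", "high", "high"],
})

# Count the rows with a high level for each username.
high_counts = df.loc[df["level"] == "high", "username"].value_counts()

# .gt(2) only marks each username True or False; it does not drop any of them.
mask = high_counts.gt(2)

# Indexing with the mask keeps the usernames that occur more than 2 times.
repeated_high = high_counts[mask].index.tolist()
print(repeated_high)
```

Changing the argument of `.gt()` (or using `.ge()` for "at least") adjusts the threshold.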

score 0 · Answer 2 · answered Aug 15 '21 at 10:24

Thanks to @Corralien's line of code, I came up with this solution. It prints each username that is duplicated in the file: if a username has at least 3 rows with a high level in the other column, it is appended to a list. Please comment here if you have a better solution!

import pandas as pd

df = pd.read_csv("PATH")

# Keep only the rows with a high level and count them once per username.
high = df[df['level'] == 'high']
high_counts = high['username'].value_counts()

Medical_alert_list = []
for username in high['username'].dropna():
    # Flag usernames that have at least 3 rows with a high level.
    if high_counts[username] >= 3:
        Medical_alert_list.append(username)

# Remove repeats while keeping the order of first appearance.
final_new_menu = list(dict.fromkeys(Medical_alert_list))
alert = "\033[31m[!]\033[0m "  # red "[!]" prefix (ANSI escape codes)

for username in final_new_menu:
    print("{}{}".format(alert, username))
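
The same list can probably be built without an explicit loop. A sketch, assuming the `username` and `level` columns from the code above; the function name `find_alert_usernames`, the `min_count` parameter, and the sample data are hypothetical:

```python
import pandas as pd


def find_alert_usernames(df, min_count=3):
    """Return usernames with at least `min_count` rows whose level is 'high'."""
    high = df[df["level"] == "high"]
    # Size of each username's group, repeated on every row of that group.
    group_sizes = high.groupby("username")["username"].transform("size")
    # unique() keeps the order of first appearance, like dict.fromkeys().
    return high.loc[group_sizes >= min_count, "username"].unique().tolist()


# Hypothetical sample data for illustration.
sample = pd.DataFrame({
    "username": ["sara", "ali", "sara", "ali", "sara", "ali", "omar", "omar"],
    "level": ["high", "high", "high", "high", "high", "high", "high", "low"],
})

alert = "\033[31m[!]\033[0m "  # red "[!]" prefix (ANSI escape codes)
alert_usernames = find_alert_usernames(sample)
for username in alert_usernames:
    print(f"{alert}{username}")
```

Keeping the threshold as a parameter makes it easy to switch between "at least 3" and any other cutoff without touching the filtering logic.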
