Using .drop with Pandas to drop a single row by a value

Question

Have code to send out emails, some rows have the same name of a person to send a email to, but each rows have a unique value. I'm trying to get it to send one email with all the values where there are duplicate cells in the name column with the same name, but all unique values in those rows.

So to this I tried making a duplicate data frame by pulling the the rows with duplicate rows to make it easier for if else statements. Since when I send out emails it sends the same values in different emails multiple times I tried making it so every time it sends out a email it would drop the row of the value it used.

At the end of the code instead of dropping a single row from a single value it drops every row with a value.

import smtplib
import pandas as pd
import ssl
import xlrd

your_name = "Name Here"
your_email = "Email you using here"
your_password = "password to email here"
cc = input("Email of person you want to cc in email: ")

# for gmail change smtp-mail.outlook.com to smtp.gmail.com, 465
server = smtplib.SMTP_SSL('smtp-mail.outlook.com', 587)
server.ehlo()
server.login(your_email, your_password)

data = [['bob', 'testemail@gmail.com', 'howdy'], ['joe', 
'testemail@gmail.com', 'hi'], ['bill', 'testemail@gmail.com', 'hey'], 
['bob', 'testemail@gmail.com', 'hola'],['bob', 'testemail@gmail.com',    
'hello'], ['josh', 'testemail@gmail.com', 'yo'], ['austin', 
'testemail@gmail.com', 'cya']

df = pd.DataFrame(data, columns = ['Pending Action From', 'Email', 
'values'])



all_names = email_list['Pending Action From']
all_emails = email_list['Email']
all_values = email_list['values']

# Takes duplicate row based off same name
duplicateRowsDF = email_list[email_list.duplicated(['Pending Action From'])]
duplicateRowsDF.to_excel('DuplicateQARList.xlsx', index=False)
duplicateRowsDF = pd.read_excel('DuplicateQARList.xlsx')

# removes duplicate row based off same name and keeps first same name
email_list.drop_duplicates(subset='Pending Action From', keep="first", inplace=True)

# email_list.to_excel('TestQARList.xlsx')


duplicate_values = duplicateRowsDF['value']
duplicate_names = duplicateRowsDF['Pending Action From']

 # Create the email to send
def full_email():
    full_email = ("From: {0} <{1}>\n"
                  "To: {2} <{3}>\n"
                  "Cc: {4}\n"
                  "Subject: {5}\n\n"
                  "{6}"
                  .format(your_name, your_email, name, email, cc, 
subject, message))
# Create the email to send

# In the email field, you can add multiple other emails if you want
# all of them to receive the same text
try:
    server.sendmail(your_email, [email], full_email)
    print('Email to {} successfully sent!\n\n'.format(email))
except Exception as e:
    print('Email to {} could not be sent :( because {}\n\n'.format(email, str(e)))


for idx in range(len(email_list)):

    email = all_emails[idx]
    name = all_names[idx]
    value = all_values[idx]

    for i in range(len(duplicateRowsDF)):

        duplicate_value = duplicate_values[i]
        duplicate_name = duplicate_names[i]

        if name == duplicate_name and value != duplicate_value and duplicate_names[i] == duplicate_names[i]:



            message = value, duplicate_values[i]

            full_email()
            email_list = email_list.drop(value)



# Close the smtp server
server.close()

Example of data frames made, it takes duplicates with the same name in the pending action from row and puts those rows in another data frame.

It only took two, but its supposed to take the data from all name values that are duplicate. It sends one email same their are duplicates of the same person, but it takes all the separate values for that one person.

Welcome to stack overflow! Can you provide a [mcve] including a sample of your input dataframe so that we can better understand your issue? It seems like you migt be able to use a groupby but it's difficult to say — G. Anderson, Mar 24 '20 at 21:08
Thanks for the feedback, I added some pictures let me know if that helps. Thanks! — YellowJacket, Mar 25 '20 at 15:17
Thanks for updating, however it's important to include sample data in the text of your question, not as a picture or a link. We can't reproduce or test against a picture of data. See [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — G. Anderson, Mar 25 '20 at 15:20
Got it, should include everything. I just made a copy of the dataframe in the file. Thanks! — YellowJacket, Mar 26 '20 at 15:13
I'm still not clear on your task, but would it be useful to use `df.groupby(['Pending Action From','Email']).agg(list)` to get all `'values'` grouped by name/email, then take action? Then you don't have any duplicated names to worry about, and you can `pop` each row and process it to send email...? — G. Anderson, Mar 26 '20 at 15:26
I will have to look into groupby because I've never used that before, but it sounds like that would work. I will look into it and test it out, very much appreciated. — YellowJacket, Mar 26 '20 at 15:40
As an example, on your test df the output of the above is (in oart): `bill testemail@gmail.com [hey] bob testemail@gmail.com [howdy, hola, hello]` — G. Anderson, Mar 26 '20 at 15:42
After making group_list = email_list.groupby(['Pending Action From', 'Email']).agg(list) I tried doing all_names = group_list['Pending Action From'] and so on, but that gives an error. So I did new_list = pd.DataFrame(group_list, columns= ['Pending Action From', 'Email', 'values']). For some reason it shows two colums with "pending action from' with nan in all the cells. — YellowJacket, Mar 26 '20 at 17:06
If it's a `keyerror`, it's because your `groupby` makes the columns into a multiindex. Try including the `as_index=False` parameter, as in `df.groupby(['Pending Action From','Email'], as_index=False).agg(list)` — G. Anderson, Mar 26 '20 at 17:14
Added as answer fr record-keeping purposes. Feel free to accept it if you like — G. Anderson, Mar 26 '20 at 17:29

score 0 · Accepted Answer · answered Mar 26 '20 at 17:28

In this case, it appears you want to make a list of all row values for each, so rather than process them one-by-one and drop them, you can aggregate them and use pop if you want to remove each, or just operate on the grouped dataframe

df.groupby(['Pending Action From','Email'], as_index=False).agg(list)

    Pending Action From Email   values
0   austin  testemail@gmail.com [cya]
1   bill    testemail@gmail.com [hey]
2   bob testemail@gmail.com [howdy, hola, hello]
3   joe testemail@gmail.com [hi]
4   josh    testemail@gmail.com [yo]

Using .drop with Pandas to drop a single row by a value

1 Answers1