I will have a large list of emails that I need to compare regularly against a small list of domains (20 to 30 entries) maintained by a non-Python user, likely in an .xls, .txt, or .csv file. Any email whose domain is listed in this external file needs to be removed from the list. Any general tips on setting this up? I already know how to loop over the emails and remove matches, but I'm less confident about the best way to reference the external file. Thanks so much.
1 Answer
I'd approach it by using pandas to read the file. `read_csv` can open different kinds of delimited text files (comma-separated values in a .csv, for example) and returns a pandas DataFrame that you can compare against the list of emails you already have.
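As a minimal sketch of the reading step, assuming the domain file is plain text or CSV with one domain per line and no header row (the sample contents and the names `load_domains` and `blocked` are hypothetical, not from the question):

```python
import io
import pandas as pd

def load_domains(source):
    """Read one domain per line and return a set of normalized domains."""
    # header=None: assume no header row; names= labels the single column.
    df = pd.read_csv(source, header=None, names=["domain"], dtype=str)
    # Strip whitespace and lowercase so "Example.com " matches "example.com".
    domains = df["domain"].dropna().str.strip().str.lower()
    # Drop entries that are empty after stripping.
    return set(domains[domains != ""])

# StringIO stands in for the real file; pass a path such as "domains.csv" instead.
sample = io.StringIO("Example.com\nspam.org \n")
blocked = load_domains(sample)
```

Returning a set keeps later membership checks cheap. For an .xls file, `pd.read_excel` would be the analogous call, though it needs an Excel engine installed alongside pandas.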
Pro tip: you probably want to store the list of emails you already have somewhere, right? If you store them as a CSV, you can read them with pandas too. After that, you can remove the occurrences by following the answer on "Diff between two dataframes in pandas".
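The linked answer diffs two DataFrames. For this particular case, a simpler alternative may be to extract each email's domain and filter with `Series.isin`. A sketch, assuming the emails sit in a column named `email` (the column name and sample data are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; in practice this could come from pd.read_csv.
emails = pd.DataFrame({"email": ["a@example.com", "b@keep.net", "C@Spam.org"]})
blocked = {"example.com", "spam.org"}  # hypothetical domains to drop

# Take the text after the last "@" and lowercase it for comparison.
domains = emails["email"].str.rsplit("@", n=1).str[-1].str.lower()
# Keep only rows whose domain is not in the blocked set.
kept = emails[~domains.isin(blocked)].reset_index(drop=True)
```

Here `kept` holds only the rows whose domain is not blocked, and the original `emails` frame is left unchanged.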
Happy coding!

Kevin Islas
`pandas` here seems like using a sledgehammer to swat a fly – juanpa.arrivillaga Nov 07 '18 at 20:16
If the list of emails was in the thousands, say 5K to 10K long, would that make this a better candidate for pandas? – LML Nov 08 '18 at 23:12