
I am running a Selenium script that captures each employee's name and office location and saves them to a DataFrame.

Employee   Office
John Doe   Building One
Kim Joe    Building One
Harry P    Building two
Harry P    Building Three

CSV
   Employee   Office
    Kim Joe   Building One
    Harry P   Building two
    Harry P   Building Three

My code is below:

df2 = df2.append({'Employee': emp.text,'Office': loc}, ignore_index=True)
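
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the line above fails on current versions. A minimal sketch of one alternative is to collect the rows in a plain list and build the DataFrame once at the end. The `scraped` list here is a hypothetical stand-in for the values the Selenium loop produces (`emp.text` and `loc`); it is not part of the original script.

```python
import pandas as pd

# Hypothetical stand-in for the (emp.text, loc) pairs scraped by Selenium.
scraped = [
    ("John Doe", "Building One"),
    ("Kim Joe", "Building One"),
    ("Harry P", "Building two"),
    ("Harry P", "Building Three"),
]

# Accumulate one dict per row instead of appending to the DataFrame in the loop.
rows = []
for name, office in scraped:
    rows.append({"Employee": name, "Office": office})

# Build the DataFrame once, after the loop has finished.
df2 = pd.DataFrame(rows, columns=["Employee", "Office"])
print(df2)
```

Building the frame once also avoids copying the whole DataFrame on every iteration, which is what repeated appends did.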

I read the CSV file into a DataFrame, df1, and want to compare the two columns across the two DataFrames. I tried the code below, but it does not work because the indexes might be different:

df2[df1.ne(self.df2).any(axis=1)]

What I want is the employee/office pairs that exist in df2 but not in df1. I am not sure how merge works; I tried it but did not get the desired result. I also tried using a dictionary instead of DataFrames, but I think that failed because the keys are not unique. I am open to other approaches.

Desired output
John Doe   Building One
Ronron
  • Have you looked at [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101)? Please also see [Pandas: how to ask](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – ddejohn Sep 22 '21 at 01:51
  • `df2.loc[~df2['Employee'].isin(df['Employee'])]` – Chris Sep 22 '21 at 01:51
  • @Chris this won't work because it checks employee names only. In my scenario, if 'Harry P' 'Building two' is missing from file 2, it won't be printed because Harry P exists in the file. The solution should check both the employee name and the location – Ronron Sep 22 '21 at 02:14
  • @ddejohn I am finding it hard to understand merges but will give it a try again, thanks – Ronron Sep 22 '21 at 02:16

1 Answer


Try using np.isin, where df1 is the DataFrame read from the CSV:

>>> import numpy as np
>>> df2[~np.isin(df2, df1).all(axis=1)]
   Employee        Office
0  John Doe  Building One
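
One caveat: `np.isin` tests each cell against every value in the other frame, regardless of row or column, so a pair such as ('Harry P', 'Building One') would count as present even though no such row exists in the CSV. If a strict row-by-row comparison is needed, a left merge with `indicator=True` is one option. The following is a minimal sketch, assuming both frames have exactly the `Employee` and `Office` columns shown in the question; the sample data is rebuilt by hand here rather than scraped or read from a file.

```python
import pandas as pd

# Sample data copied from the question: df1 is the CSV, df2 is the scraped frame.
df1 = pd.DataFrame({
    "Employee": ["Kim Joe", "Harry P", "Harry P"],
    "Office": ["Building One", "Building two", "Building Three"],
})
df2 = pd.DataFrame({
    "Employee": ["John Doe", "Kim Joe", "Harry P", "Harry P"],
    "Office": ["Building One", "Building One", "Building two", "Building Three"],
})

# Left-merge on both columns; the "_merge" column records where each row was found.
# drop_duplicates() keeps repeated CSV rows from multiplying the matches.
merged = df2.merge(
    df1.drop_duplicates(),
    on=["Employee", "Office"],
    how="left",
    indicator=True,
)

# "left_only" marks employee/office pairs present in df2 but absent from df1.
only_in_df2 = merged.loc[merged["_merge"] == "left_only", ["Employee", "Office"]]
print(only_in_df2)
```

For the sample data this selects the single row John Doe / Building One. Because the match is on the (Employee, Office) pair, a missing 'Harry P' / 'Building two' row would also be reported even when other Harry P rows exist.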
U13-Forward
  • Solution 1 works. Solution 2 does not work for me because if 'Harry P' 'Building two' is missing in file 2, it won't be caught, since we are looking through the employee names only. The solution must be based on checking both employee and location. Your first solution works, thanks – Ronron Sep 22 '21 at 02:11
  • @Ronron Removed solution 2, please accept and upvote if it works :) – U13-Forward Sep 22 '21 at 02:12
  • @Ronron Read https://stackoverflow.com/help/someone-answers, it's helping the community :) – U13-Forward Sep 22 '21 at 02:13