I have a DataFrame (df1) with three columns: fname, lname, and zip.
fname lname zip
ty zz 123
rt kk 345
yu pp 678
I also have another DataFrame, master_df, that contains only a list of zip codes:
zip_codes
123
345
555
667
I want to write PySpark SQL code that checks whether the zip codes in df1 appear in the master list. Any row whose zip code is not present in master_df should go into another DataFrame.
I tried :
df3 = df1.filter(df1["zip"]!=master["zip_codes"])
My required output_df should show only 678, since it is not present in master_df.