0

Below are the columns from csv , each and every Country_code match with snowflake table . I have tried with all connections and able to read snowflake data in pandas dataframe but i didn' get how i can check csv file country_code with code in table. If it is match make flag true or if it doesn't match make it false In python pandas.

Country_code | Country
U123_CA      |     USA
C123_TR      |  Canada
I123_MU      |   India
E123_EU      |  Europe

Table data

Code    | Country   | Date 
U123_CA       USA     2021
C123_TR    Canada     2020
I999_MU     India     2022
wjandrea
  • 28,235
  • 9
  • 60
  • 81
Nathon
  • 165
  • 1
  • 4
  • 13
  • What's your desired output? Please [edit] to clarify. For reference see [mre] and [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Aug 20 '23 at 04:56
  • check Each and every country_code available in Table data on code. print if available 'PASS' otherewise fail. – Nathon Aug 20 '23 at 12:36

1 Answers1

1

Left join two DFs and get the flag with notna().

import pandas as pd
from io import StringIO
data1 = """Country_code,Country
U123_CA,USA
C123_TR,Canada
I123_MU,India
E123_EU,Europe"""
data2 = {
    'Code': ['U123_CA', 'C123_TR', 'I999_MU'],
    'Country': ['USA', 'Canada', 'India'],
    'Date': [2021, 2020, 2022]
}
df1 = pd.read_csv(StringIO(data1))
df2 = pd.DataFrame(data2)
df1 = df1.merge(df2, how='left',
   left_on=['Country_code', 'Country'],
   right_on=['Code', 'Country'])
df1['flag'] = df1['Code'].notna()
df1.drop(columns=['Code', 'Date'], inplace=True)
print(df1)

Output

  Country_code Country   flag                                                                 
0      U123_CA     USA   True                                                                 
1      C123_TR  Canada   True                                                                 
2      I123_MU   India  False                                                                 
3      E123_EU  Europe  False    
taller_ExcelHome
  • 2,232
  • 1
  • 2
  • 12
  • for every run file will come and we need to take only contry_code one by one and check with table data (some how we need to use for loop and get single country_code and check). can you please help – Nathon Aug 20 '23 at 13:00
  • Above solution is getting issue with 'can only merge series or DataFrame objects a , was passed – Nathon Aug 20 '23 at 13:41
  • Pandas get it done with more efficient approach. Please provide detail about " take only contry_code one by one and check with table data" in OP. – taller_ExcelHome Aug 20 '23 at 14:22