0

I have an Excel-File with two sheets. One contains the df1:

Country City Population Planet
Germany  Berlin 30500    Earth
Spain    Madrid  21021   Earth
...

And the second contains the df2:

Country   City   Population Planet
Spain    Madrid  21021   Earth
...

Now I want to compare the two dataframes and check if there are rows in df1 which are also in df2 and if yes then: I want to add a new column to df1 which has the name double and just want to put an "X" if the row is in df1 and in df2.

JDog
  • 29
  • 3
  • You could use [pandas](https://pandas.pydata.org/) to read both excel sheets as suggested in [this answer](https://stackoverflow.com/questions/26521266/using-pandas-to-pd-read-excel-for-multiple-worksheets-of-the-same-workbook). Read sheet2 into a list, and - while iterating over rows in sheet1 - you could check if this row exists in your list and append that information accordingly. – trotta May 09 '19 at 09:58

1 Answers1

0
# create string data

df1_str = '''Country,City,Population,Planet
Germany,Berlin,30500,Earth
Spain,Madrid,21021,Earth'''

df2_str = '''Country,City,Population,Planet
Spain,Madrid,21021,Earth'''
# read in to dataframe
df1 = pd.read_csv(io.StringIO(df1_str))
# read in to list for iteration
df1_list = pd.read_csv(io.StringIO(df1_str)).values.tolist()
df2_list = pd.read_csv(io.StringIO(df2_str)).values.tolist()

# join all columns and make a unique combination
df1_list = ["-".join(map(str, item)) for item in df1_list]
df2_list = ["-".join(map(str, item)) for item in df2_list]

# check the combinations exist in both data frame
common_flag = []
for item1 in df1_list:
    for item2 in df2_list:
        if item1 in item2: # you might prefer item1 == item2:
            common_flag.append("X")
        else:
            common_flag.append(None)

# add the result to datagrame df1
df1["double"] = pd.Series(common_flag)

Make sure the column order are same in both dataframe when creating combination list.

output:

   Country    City  Population Planet double
0  Germany  Berlin       30500  Earth   None
1    Spain  Madrid       21021  Earth      X
DataPsycho
  • 958
  • 1
  • 8
  • 28