
I have two address columns in two different dataframes, and the columns have different lengths. I want to check whether each element in the column of the first dataframe matches any element in the column of the second dataframe, and return a boolean value for each element.

How do I implement this in Python?

Dataframe 1:

0 New Delhi, India
1 Mumbai, India
2 Bangalore, India
3 Dwarka, New Delhi, India

Dataframe 2:

0 Nepal
1 Assam, India
2 Delhi

Result (its length should equal the length of the column in Dataframe 1):

True
False
False
True
  • Hi. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Feb 04 '19 at 08:04
  • Could you tell me what you did not understand in the question? – Deepankar Garg Feb 04 '19 at 08:09
  • I cannot see any data, so main problem [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) is missing. – jezrael Feb 04 '19 at 08:10
  • Check this question https://stackoverflow.com/questions/9542738/python-find-in-list – vmaroli Feb 04 '19 at 08:11
  • I have added an example. Hope it will make more sense now! – Deepankar Garg Feb 04 '19 at 08:18
  • So `Dataframe 1` always has values with one letter like `A,B,F,H,a,v,M`? – jezrael Feb 04 '19 at 08:23
  • Not necessarily. They could have any length. Consider the columns in both dataframes to be something like 'residential address'. – Deepankar Garg Feb 04 '19 at 08:24
  • I'm afraid this still seems too far abstracted from your real data. Are you really looking for single-character matches, or substring matches, or something else entirely? – tripleee Feb 04 '19 at 08:32
  • Like I said in the previous comment, consider the two columns as the 'address' of multiple places. – Deepankar Garg Feb 04 '19 at 08:33
  • I have further updated the example! – Deepankar Garg Feb 04 '19 at 09:37

1 Answer

import pandas as pd
sales1 = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
     {'account': 'Alpha Co',  'Jan': 200, 'Feb': 210, 'Mar': 215},
     {'account': 'Blue Inc',  'Jan': 50,  'Feb': 90,  'Mar': 95 }]

sales2 = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
     {'account': 'A',  'Jan': 200, 'Feb': 210, 'Mar': 215},
     {'account': 'S',  'Jan': 50,  'Feb': 90,  'Mar': 95 }]

df1 = pd.DataFrame(sales1)
df2 = pd.DataFrame(sales2)

def CheckDF(df1,df2):
    # Pair the rows of both 'account' columns by position and compare string lengths.
    for (item, Value), (item1, Value1) in zip(df1['account'].items(), df2['account'].items()):
        if len(str(Value).strip()) == len(str(Value1).strip()):
            print(True)
        else:
            print(False)

CheckDF(df1,df2)

DF1:

   Feb  Jan  Mar    account
0  200  150  140  Jones LLC
1  210  200  215   Alpha Co
2   90   50   95   Blue Inc

DF2:

   Feb  Jan  Mar    account
0  200  150  140  Jones LLC
1  210  200  215          A
2   90   50   95          S

Output:

True
False
False
Anonymous
  • Thanks for the answer, Suraj. But what I am looking for is NOT length matching of two different strings, but rather matching of the contents of the strings. If the addresses in df1 and df2 match, it should return True, otherwise False. – Deepankar Garg Feb 04 '19 at 08:38
  • This is basically it, just needs to replace `if len(str(Value).strip()) == len(str(Value1).strip()):` with `if str(Value).strip() in str(Value1).strip():`. Interesting to see if someone else has a vectorised solution. +1 – Josh Friedlander Feb 04 '19 at 08:42
  • I will try this, though I think 'in' compares row-wise, i.e. row[0] of df1 in row[0] of df2. It won't compare with all rows of df2. – Deepankar Garg Feb 04 '19 at 08:44
  • I tried it, and it's not working correctly. I used the same example as Suraj illustrated, only moving "Jones LLC" from the first position to the last position in one of the dataframes, and it fails to identify its existence. – Deepankar Garg Feb 04 '19 at 09:04
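
As the last comments point out, a `zip`-based loop only compares rows at the same position. One possible sketch of an "any row" comparison, using the addresses from the question, is shown here. It assumes that "match" means a value from the second dataframe appears as a substring of the value in the first dataframe (which is what the expected result suggests), and the column name `address` is a placeholder, since the question does not name the columns.

```python
import re
import pandas as pd

# Example data from the question; the column name "address" is assumed.
df1 = pd.DataFrame({"address": ["New Delhi, India", "Mumbai, India",
                                "Bangalore, India", "Dwarka, New Delhi, India"]})
df2 = pd.DataFrame({"address": ["Nepal", "Assam, India", "Delhi"]})

# Stripped search terms taken from the second dataframe.
needles = [str(s).strip() for s in df2["address"]]

# Plain-Python version: for each df1 value, is ANY df2 value a substring of it?
result_loop = [any(n in str(a) for n in needles) for a in df1["address"]]

# Vectorised version: one regex alternation of the escaped df2 values.
# Note: an empty df2 would give an empty pattern, which matches every row.
pattern = "|".join(re.escape(n) for n in needles)
result_vec = df1["address"].astype(str).str.contains(pattern, regex=True)

print(result_vec.tolist())  # [True, False, False, True]
```

Both versions are case-sensitive. If exact equality rather than substring matching is wanted, `df1["address"].isin(df2["address"])` would be the closer fit, but it would not reproduce the expected result above.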