I have two dataframes and I want to compare the contents of their text columns:

DF1:

df1 = pd.DataFrame({'text1':['Today is monday','I need money'],'name':['calendar', 'finance']})

  
      text1                    name
Today is monday              calendar
I need money                 finance

DF2:

df2 = pd.DataFrame({'text2':['My car','Saturday Night','Lost my Wallet']})

  text2                   
My car             
Saturday Night                
Lost my Wallet           

and then I am performing an NLP task that compares df1 and df2 text:

df = pd.DataFrame()
for i in df1['text1']:
    for j in df2['text2']:
        s=nlp(i)
        p=nlp(j)
        out = s.similarity(p)
        df=df.append({'text1':i, 'text2':j, 'index':out}, ignore_index=True)

And the output is:

      text1                       text2              index
Today is monday                My car                  1
Today is monday                Saturday Night          0.23
Today is monday                Lost my Wallet          0.40
I need money                   My car                  0.34
I need money                   Saturday Night          0.13
I need money                   Lost my Wallet          0.20

The output above works just fine: I am simply comparing the text1 and text2 values. What index means is not important for this problem.

I want to change the output to also include the df1['name'] column, but I don't know how to do it because my loop iterates over the text column only.

Expected Output:

      text1         name                     text2              index
Today is monday     calendar               My car                  1
Today is monday     calendar               Saturday Night          0.23
Today is monday     calendar               Lost my Wallet          0.40
I need money        finance                My car                  0.34
I need money        finance                Saturday Night          0.13
I need money        finance                Lost my Wallet          0.20
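
One possible way to get this shape, as a minimal sketch rather than a definitive solution: iterate over text1 and name together with zip, so each name travels along with its text, and collect the rows in a list before building the dataframe once. The similarity function below is a hypothetical stand-in (based on difflib) so the sketch runs without a spaCy model; it is assumed that nlp(i).similarity(nlp(j)) would be swapped back in, and the scores it prints will differ from the ones above.

```python
from difflib import SequenceMatcher

import pandas as pd

df1 = pd.DataFrame({'text1': ['Today is monday', 'I need money'],
                    'name': ['calendar', 'finance']})
df2 = pd.DataFrame({'text2': ['My car', 'Saturday Night', 'Lost my Wallet']})


def similarity(a, b):
    # Hypothetical stand-in for nlp(a).similarity(nlp(b)); replace it with
    # the real spaCy call in the actual task.
    return SequenceMatcher(None, a, b).ratio()


rows = []
# zip pairs each text1 value with the name from the same row of df1
for text1, name in zip(df1['text1'], df1['name']):
    for text2 in df2['text2']:
        rows.append({'text1': text1, 'name': name, 'text2': text2,
                     'index': similarity(text1, text2)})

# Build the dataframe once from the collected rows
result = pd.DataFrame(rows, columns=['text1', 'name', 'text2', 'index'])
print(result)
```

Collecting dicts in a list and calling pd.DataFrame once also avoids df.append, which was removed in pandas 2.0.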
datashout