I have a data frame that I created from an imported CSV file in Python. Following a fuzzy-match approach, I wrote the code below, but I'm stuck on how to apply the same logic to columns in a data frame:

from fuzzywuzzy import fuzz

Str1 = "Los Angeles Lakers"
Str2 = "Lakers"
# Lower-case both strings so the comparison is case-insensitive
Partial_Ratio = fuzz.partial_ratio(Str1.lower(), Str2.lower())
print(Partial_Ratio)

However, I want to apply the same logic to two columns in a data frame. I tried the code below, but it doesn't work:

import pip
import pandas as pd
df = pd.read_csv(r'C:\Users\lozza\Documents\Work\Python_Packages\biz.csv')
pip.main(['install','fuzzywuzzy','fuzz'])
from fuzzywuzzy import fuzz
df1 = pd.DataFrame(df)
df1['Partial_Ratio'] = (fuzz.partial_ratio(df1['item_desc'].lower(),df1['desc'].lower()))
print(df1[['item','item_desc', 'desc','Partial_Ratio']])

1 Answer

If you want to compare the two strings in each row against each other, I would just iterate through the dataframe. Otherwise, you can use fuzzywuzzy's process module to compare one query string against a list of choices (I don't think that is your intent here).

import pandas as pd
from fuzzywuzzy import fuzz
df = pd.DataFrame({"string1" : ['Los Angeles Lakers', 'Apple Inc.', 'toys'], 
                   "string2" : ['Lakers', 'apple inc', 'toyota']})
df['Partial_Ratio'] = None

#print(df)
'''
    string1              string2        Partial_Ratio
0   Los Angeles Lakers   Lakers         None
1   Apple Inc.           apple inc      None
2   toys                 toyota         None
'''

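# something like this: score each row and write the result with .loc,
# which avoids the chained assignment df['Partial_Ratio'][i] = ...
for i in df.index:
    df.loc[i, 'Partial_Ratio'] = fuzz.partial_ratio(df.loc[i, 'string1'].lower(), df.loc[i, 'string2'].lower())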

#print(df)
'''
    string1              string2        Partial_Ratio
0   Los Angeles Lakers   Lakers         100
1   Apple Inc.           apple inc      100
2   toys                 toyota         75
'''
Colin
  • Hi, what about if the dataframe comes from a CSV, i.e. looping through the values in a column of a CSV import, with the same output as you describe? – Lawrence Giordano Jan 26 '21 at 10:07