I have a pandas dataframe with a format exactly like the one in this question and I'm trying to achieve the same result. In my case, I am calculating the fuzz-ratio between the row's index and it's corresponding col.
If I try this code (based on the answer to the linked question)
def get_similarities(x):
return x.index + x.name
test_df = test_df.apply(get_similarities)
the concatenation of the row index and col name happens cell-wise, just as intended. Running type(test_df)
returns pandas.core.frame.DataFrame
, as expected.
However, if I adapt the code to my scenario like so
def get_similarities(x):
return fuzz.partial_ratio(x.index, x.name)
test_df = test_df.apply(get_similarities)
it doesn't work. Instead of a dataframe, I get back a series (the return type of that function is an int
)
I don't understand why the two samples would not behave the same nor how to fix my code so it returns a dataframe, with the fuzzy.ratio
for each cell between the a row's index for that cell and the col name for that cell.