I have the following function:

def match_function(df):

    columns = df.columns.tolist()
    matches = {}

    for column in columns:
        df = df.astype(str)
        df_new = df.dropna()
        df_new = df[column].str.split(',', expand=True)
        df_new = df_new.apply(lambda s: s.value_counts(), axis=1).fillna(0)
        match = df_new.iloc[:, 0][0] / df_new.sum(axis=1) * 100
        match = round(match, 2)

        df[column] = match
        matches[column] = match

    return matches

I want this function to run completely separately for each row of the dataframe: it should process the first row, finish, then run again for the second row, and so on.

Because it's written in such a complex and unprofessional way (I'm new to Python), the result is wrong when I pass a dataframe and it processes the whole dataframe at once. Alternatively, maybe the function itself could be changed so that it works only row by row.

Archi
  • @YevhenKuzmovych maybe. I need to adjust this function somehow, and I'm not sure how to do it. – Archi Oct 05 '21 at 08:51

1 Answer

Consider the following df:

          a         b
0  1.000000  0.000000
1 -2.000000  1.000000
2  1.000000  0.000000
3  3.000000 -4.000000

And the following function, named "func":

def func(x):
    return x['a'] + x['b']

You can apply that function row by row with:

df.apply(func, axis=1)

That yields:

0    1.000000
1   -1.000000
2    1.000000
3   -1.000000
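
To reproduce this end to end, here is a self-contained version. The DataFrame construction is my own reconstruction from the table printed above, so treat the exact values as an assumption.

```python
import pandas as pd

# Rebuild the example frame shown above (values copied from the printed table).
df = pd.DataFrame({"a": [1.0, -2.0, 1.0, 3.0], "b": [0.0, 1.0, 0.0, -4.0]})

def func(x):
    # With axis=1, x is a single row passed as a Series indexed by column name.
    return x["a"] + x["b"]

# axis=1 calls func once per row and collects the results in a Series.
result = df.apply(func, axis=1)
print(result)
```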

So basically, for every row, we applied the function func(), which returns x['a'] + x['b'] for that row.
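
The same pattern could be adapted to your case. Below is a minimal sketch, not a drop-in replacement: I am assuming that what you want per cell is the percentage of comma-separated values that equal the first value in that cell. The names first_token_share and match_row, and the sample data, are hypothetical and only there for illustration.

```python
import pandas as pd

def first_token_share(cell):
    # Hypothetical helper (assumed intent): percentage of the comma-separated
    # tokens in one cell that are equal to the first token.
    tokens = [t.strip() for t in str(cell).split(",")]
    return round(tokens.count(tokens[0]) / len(tokens) * 100, 2)

def match_row(row):
    # row is a Series holding a single row, so each cell is processed
    # independently of every other row in the frame.
    return row.map(first_token_share)

# Made-up sample data, only to show the call.
df = pd.DataFrame({"x": ["a,a,b", "c,d"], "y": ["p,p,p,q", "r"]})

# Returning a Series from the row function gives back a DataFrame
# with the same columns as the input.
matches = df.apply(match_row, axis=1)
print(matches)
```

Because match_row only ever sees one row, nothing computed for the first row can leak into the second, which seems to be what goes wrong in the original function.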

BlackMath