Pandas apply method on a column only when conditions are met

Question

I want to use df.apply on a DataFrame column.

The df represents hierarchical biological classifications.

I want to show the most relevant classification for each row, depending on which columns contain data. I have written a function that should do this:

def condition(data):
    for i in range(len(data)):
        if data.G[i] and data.Taxonomy[i]:
            return(data.G[i] + " " +data.Taxonomy[i])
        elif data.G[i] and not data.Taxonomy[i]:
            return(data.G[i])
        elif not data.G[i] and not data.Taxonomy[i]:
            return(data.F[i])
        elif data.O[i] and not data.G[i] and not data.Taxonomy[i]:
            return(data.O[i])
        elif data.C[i] and not data.O[i] and not data.G[i] and not data.Taxonomy[i]:
            return(data.C[i])
        elif data.P[i] and not data.O[i] and not data.G[i] and not data.Taxonomy[i]:
            return(data.P[i])
        elif data.k[i] and not data.P[i] and not data.O[i] and not data.G[i] and not data.Taxonomy[i]:
            return(data.k[i])

I have attempted to apply this function to the DataFrame to create an additional column that holds the result of condition() for each row:

data['name']=data.apply(lambda x: condition(data), axis = 1)

I receive the following output:

[Image: output after df.apply]

The same result is repeated in every row instead of the function being applied per row.

How can I apply this function so it gives the desired output?

Apply is not the way to go, see first answer in the duplicate link with `np.select` — Erfan, Oct 01 '20 at 14:54
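The np.select approach mentioned in this comment could look roughly like the sketch below. The sample rows are hypothetical, and it assumes that missing classifications are stored as empty strings (not NaN). Conditions are checked in order, so the most specific rank that is present wins.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the question, values are made up.
data = pd.DataFrame({
    "k": ["Bacteria", "Bacteria", "Bacteria"],
    "P": ["Firmicutes", "Firmicutes", "Firmicutes"],
    "C": ["Bacilli", "Bacilli", ""],
    "O": ["Lactobacillales", "Lactobacillales", ""],
    "F": ["Lactobacillaceae", "Lactobacillaceae", ""],
    "G": ["Lactobacillus", "", ""],
    "Taxonomy": ["casei", "", ""],
})

# Assumption: an empty string means "no data" for that rank.
has = data[["k", "P", "C", "O", "F", "G", "Taxonomy"]].ne("")

# np.select picks the first condition that is True for each row.
conditions = [
    has["G"] & has["Taxonomy"],
    has["G"],
    has["F"],
    has["O"],
    has["C"],
    has["P"],
    has["k"],
]
choices = [
    data["G"] + " " + data["Taxonomy"],
    data["G"],
    data["F"],
    data["O"],
    data["C"],
    data["P"],
    data["k"],
]

data["name"] = np.select(conditions, choices, default="")
print(data["name"].tolist())
```

Because every condition is a whole boolean column, no Python-level loop over rows is needed.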

score 0 · Answer 1 · answered Oct 01 '20 at 14:53

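You must pass the lambda's argument x (the current row) to condition, not data (the whole DataFrame). With axis=1, condition then receives a single row, so the loop and the [i] indexing inside it are no longer needed: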

data['name']=data.apply(lambda x: condition(x), axis = 1)

instead of

data['name']=data.apply(lambda x: condition(data), axis = 1)
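As a minimal sketch of what a row-wise version of condition might look like: the sample rows below are hypothetical, and the code assumes missing classifications are empty strings (NaN would count as truthy and need pd.notna checks instead). It also assumes the intent is to fall back to the most specific rank that has data, which differs slightly from the original elif chain.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question, values are made up.
data = pd.DataFrame({
    "k": ["Bacteria", "Bacteria", "Bacteria"],
    "P": ["Firmicutes", "Firmicutes", "Firmicutes"],
    "C": ["Bacilli", "Bacilli", ""],
    "O": ["Lactobacillales", "Lactobacillales", ""],
    "F": ["Lactobacillaceae", "Lactobacillaceae", ""],
    "G": ["Lactobacillus", "", ""],
    "Taxonomy": ["casei", "", ""],
})

def condition(row):
    """Return the most specific classification available in a single row."""
    # Genus plus the Taxonomy value when both are present.
    if row["G"] and row["Taxonomy"]:
        return row["G"] + " " + row["Taxonomy"]
    # Otherwise fall back to the first non-empty rank, most specific first.
    for rank in ["G", "F", "O", "C", "P", "k"]:
        if row[rank]:
            return row[rank]
    return ""

# With axis=1, apply passes each row to condition as a Series.
data["name"] = data.apply(condition, axis=1)
print(data["name"].tolist())
```

Here condition never touches the whole DataFrame, so each row gets its own result.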

IoaTzimas