I have a fairly simple problem that I could solve by iterating over the rows of a DataFrame. However, I have read that this is rarely good practice, so I'm wondering how to avoid it.
Dummy DataFrame
In this example I'd like to automatically give a new name to fruits that are special, according to a conventional rule (as shown in the code below).
This default name should only be applied if the fruit is special and 'Logic name' is still unknown.
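In Python I would write something like this:

for idx in range(len(df)):
    # Use a single .loc[row, column] call; chained indexing such as
    # df.loc[idx]['Logic name'] = ... assigns to a temporary copy.
    if df.loc[idx, 'Logic name'] == 'unknown' and df.loc[idx, 'Special'] == 'yes':
        df.loc[idx, 'Logic name'] = df.loc[idx, 'color'] + df.loc[idx, 'Fruit'][2:]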

The final result is this Final Dataframe
How would you avoid iteration in this case?

Mirko
  • Relevant: [Are for-loops in pandas really bad? When should I care?](https://stackoverflow.com/questions/54028199/are-for-loops-in-pandas-really-bad-when-should-i-care) – G. Anderson Mar 04 '22 at 19:56

1 Answer

Use numpy.where with a condition on both "Special" and "Logic name", so that only rows that are special and still unnamed get the default name:

import numpy as np
# Where both conditions hold, build the new name; otherwise keep the existing one.
df['Logic name'] = np.where(df['Special'].eq('yes') & df['Logic name'].eq('unknown'),
                            df['color'] + df['Fruit'].str[2:],
                            df['Logic name'])
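
For anyone who wants to try this out, here is a minimal self-contained sketch. The sample rows are hypothetical, since the original DataFrames were posted as images. Only the column names come from the question. It also shows an equivalent variant that uses a boolean mask with df.loc instead of numpy.where. That variant is an alternative I am suggesting, not something taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the original frames were shown as images).
df = pd.DataFrame({
    "Fruit": ["apple", "banana", "cherry", "mango"],
    "color": ["red", "yellow", "red", "orange"],
    "Special": ["yes", "no", "yes", "yes"],
    "Logic name": ["unknown", "unknown", "known_cherry", "unknown"],
})

# Rows that are special and still have no logic name.
mask = df["Special"].eq("yes") & df["Logic name"].eq("unknown")

# Default name: the color followed by the fruit name without its first two characters.
default_name = df["color"] + df["Fruit"].str[2:]

# numpy.where version, kept in a separate array for comparison.
expected = np.where(mask, default_name, df["Logic name"])

# Boolean-mask version: assign only to the selected rows, in place.
df.loc[mask, "Logic name"] = default_name

print(df)
```

Both versions should give the same column. The mask version only touches the matching rows, which some people find easier to read when there is no "else" value to compute.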
mozway