I want to replace all values in a DataFrame (227 rows, 397 columns) that are less than a certain value (b) with zero; every other value should become the existing value minus b. It's a kind of baseline correction. I have a solution that works: loop over every value, check the condition, and replace it.

import pandas as pd
b = 20
    
for index, row in df.iterrows():
    for col in df.columns:
        if df.loc[index, col] <= b:
            df.loc[index, col] = 0.0
        else:
            df.loc[index, col] = df.loc[index, col] - b

The code works, but I get this warning from pandas: A value is trying to be set on a copy of a slice from a DataFrame

Is there a better way to do this?

1 Answer

Use numpy.where with the DataFrame constructor to avoid looping and improve performance:

import numpy as np
df = pd.DataFrame(np.where(df <= b, 0, df - b), index=df.index, columns=df.columns)
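As a quick check, here is a minimal sketch of the numpy.where approach on a small frame. The column names and values are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

b = 20
# Hypothetical sample data, not from the question
df = pd.DataFrame({"x": [5.0, 20.0, 35.0], "y": [50.0, 10.0, 21.0]})

# Values <= b become 0; everything else is shifted down by b.
# np.where returns a plain array, so the index and columns are passed back in.
corrected = pd.DataFrame(
    np.where(df <= b, 0, df - b), index=df.index, columns=df.columns
)
print(corrected)
```

With this sample, column x should come out as 0, 0, 15 and column y as 30, 0, 1.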

Or subtract b and set the values to 0 with DataFrame.mask:

df = df.sub(b).mask(df <= b, 0)
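The sketch below, again on made-up data, checks that the mask version agrees with the element-wise rule from the question. Since the result is just df - b floored at zero, DataFrame.clip(lower=0) should give the same values; that variant is a suggestion here, not part of the original answer:

```python
import pandas as pd

b = 20
# Hypothetical sample data, not from the question
df = pd.DataFrame({"x": [5.0, 20.0, 35.0], "y": [50.0, 10.0, 21.0]})

# Reference result: apply the rule from the question one cell at a time
expected = df.copy()
for col in expected.columns:
    for idx in expected.index:
        v = df.loc[idx, col]
        expected.loc[idx, col] = 0.0 if v <= b else v - b

masked = df.sub(b).mask(df <= b, 0)   # subtract, then zero out where df <= b
clipped = df.sub(b).clip(lower=0)     # alternative: floor the shifted values at 0

print(masked.equals(expected), clipped.equals(expected))
```

Both vectorised versions build a new DataFrame instead of assigning cell by cell, which is the operation the warning in the question refers to.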
jezrael