1

I have a dataframe that looks like this:

iso3    prod_level  alloc_key   cell5m  x   y   rec_type    tech_type   unit    whea_a  ... acof_pct_prod   rcof_pct_prod   coco_pct_prod   teas_pct_prod   toba_pct_prod   bana_pct_prod   trof_pct_prod   temf_pct_prod   vege_pct_prod   rest_pct_prod
35110   IND IN16011 9243059 3990418 74.875000   13.041667   P   A   mt  0.0 ... 1.0 1.0 1.0 1.0 1.0 0.958586    0.449218    1.0 1.0 0.004520
35109   IND IN16011 9243058 3990417 74.791667   13.041667   P   A   mt  0.0 ... 1.0 1.0 1.0 1.0 1.0 0.970957    0.459725    1.0 1.0 0.009037
35406   IND IN16003 9283093 4007732 77.708333   12.708333   P   A   mt  0.0 ... 1.0 1.0 1.0 1.0 1.0 0.883868    1.000000    1.0 1.0 0.012084
35311   IND IN16011 9273062 4003381 75.125000   12.791667   P   A   mt  0.0 ... 1.0 1.0 1.0 1.0 1.0 0.942550    0.381430    1.0 1.0 0.015024
35308   IND IN16011 9273059 4003378 74.875000   12.791667   P   A   mt  0.0 ... 1.0 1.0 1.0 1.0 1.0 0.991871    0.887494    1.0 1.0 0.017878

I want to set all values that are greater than 0.9 in columns that end in 'prod' to zero. I can select only those columns like this:

cols2=[col for col in df.columns if col.endswith('_prod')]
df[cols2]
whea_pct_prod   rice_pct_prod   maiz_pct_prod   barl_pct_prod   pmil_pct_prod   smil_pct_prod   sorg_pct_prod   pota_pct_prod   swpo_pct_prod   cass_pct_prod   ... acof_pct_prod   rcof_pct_prod   coco_pct_prod   teas_pct_prod   toba_pct_prod   bana_pct_prod   trof_pct_prod   temf_pct_prod   vege_pct_prod   rest_pct_prod
35110   1.0 0.958721    0.359063    1.0 1.0 1.000000    1.0 1.0 1.00000 0.992816    ... 1.0 1.0 1.0 1.0 1.0 0.958586    0.449218    1.0 1.0 0.004520
35109   1.0 0.878148    0.200283    1.0 1.0 1.000000    1.0 1.0 1.00000 0.993140    ... 1.0 1.0 1.0 1.0 1.0 0.970957    0.459725    1.0 1.0 0.009037
35406   1.0 0.996354    0.980844    1.0 1.0 0.274348    1.0 1.0 0.99945 1.000000    ... 1.0 1.0 1.0 1.0 1.0 0.883318    1.000000    1.0 1.0 0.012084
35311   1.0 0.570999    0.341217    1.0 1.0 1.000000    1.0 1.0 1.00000 0.997081    ... 1.0 1.0 1.0 1.0 1.0 0.942550    0.381430    1.0 1.0 0.015024
35308   1.0 0.657520    0.161771    1.0 1.0 1.000000    1.0 1.0 1.00000 0.991491    ... 1.0 1.0 1.0 1.0 1.0 0.991871    0.887494    1.0 1.0 0.017878

Now, when I try and set the values greater than 0.9 to be zero, it does not work.

df[cols2][df[cols2]>0.9]=0

What should I be doing instead?

Eli Turasky
  • 981
  • 2
  • 11
  • 28

1 Answers1

4

You can use df.where(cond, other) to replace the values with other where cond == False.

df[cols2] = df[cols2].where(df[cols]<=0.9, other=0)
RichieV
  • 5,103
  • 2
  • 11
  • 24
  • furthermore, read about not using "chained assignments" on this answer[https://stackoverflow.com/a/20627316/6692898] – RichieV Mar 24 '22 at 23:39