
I have a data frame that looks like this:

fips   year  pollutant                  nonattainment
72137  1992  Sulfur Dioxide (1971)
72137  1992  PM-2.5 (1997)              P
72137  1992  8-Hour Ozone (2015)        W
72137  1992  'Nitrogen Dioxide (1971)'
72137  1993  Sulfur Dioxide (1971)
72137  1993  PM-2.5 (1997)
72137  1993  8-Hour Ozone (2015)        W
72137  1993  'Nitrogen Dioxide (1971)'

FYI:

  • The nonattainment column is either empty or has the value P or W.
  • The PM pollutants of interest are the values in this list: ['PM-2.5 (1997)', 'PM-2.5 (2006)', 'PM-10 (1987)', 'PM-2.5 (2012)'].

Task:

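I now want to add a new column called nonattainment_pm, which should contain the value 1 if, for a given fips-year combination, at least one row has a pollutant from the list above together with a nonattainment value of P or W. The value should then apply to every row of that fips-year combination.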

Expected output:

The new data frame should look like this:

fips   year  pollutant                  nonattainment  nonattainment_pm
72137  1992  Sulfur Dioxide (1971)                     1
72137  1992  PM-2.5 (1997)              P              1
72137  1992  8-Hour Ozone (2015)        W              1
72137  1992  'Nitrogen Dioxide (1971)'                 1
72137  1993  Sulfur Dioxide (1971)
72137  1993  PM-2.5 (1997)
72137  1993  8-Hour Ozone (2015)        W
72137  1993  'Nitrogen Dioxide (1971)'
futur3boy
  • What have you tried? Have you checked this post: [Creating a new column based on if-elif-else condition](https://stackoverflow.com/q/21702342/10452700)? Please search for or check similar earlier questions to avoid duplication. You didn't include what you have tried or which error you have run into so far. – Mario Jul 29 '23 at 15:32
  • Obviously I looked at that answer and also googled my problem, if I had found a suitable solution I wouldn't have asked here. – futur3boy Jul 29 '23 at 16:07

1 Answer


Here is one way:

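import numpy as np

pollutant_l = ['PM-2.5 (1997)', 'PM-2.5 (2006)', 'PM-10 (1987)', 'PM-2.5 (2012)']

# Flag rows where a PM pollutant has a nonattainment status of P or W
df['nonattainment_pm'] = np.where((df['pollutant'].isin(pollutant_l)) & (df['nonattainment'].isin(['P', 'W'])), 1, 0)
# Propagate the flag to every row of the same fips-year group
df['nonattainment_pm'] = df.groupby(['fips', 'year'])['nonattainment_pm'].transform('max')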

Output:

    fips  year                  pollutant nonattainment  nonattainment_pm
0  72137  1992      Sulfur Dioxide (1971)           NaN                 1
1  72137  1992              PM-2.5 (1997)             P                 1
2  72137  1992        8-Hour Ozone (2015)             W                 1
3  72137  1992  'Nitrogen Dioxide (1971)'           NaN                 1
4  72137  1993      Sulfur Dioxide (1971)           NaN                 0
5  72137  1993              PM-2.5 (1997)           NaN                 0
6  72137  1993        8-Hour Ozone (2015)             W                 0
7  72137  1993  'Nitrogen Dioxide (1971)'           NaN                 0
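
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample frame from the question (assuming the empty nonattainment cells are NaN) and uses a boolean mask with transform('any') in place of np.where plus transform('max'). The names pm_pollutants, is_pm_nonattainment and group_has_pm are only illustrative and do not come from the question or the answer.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; empty nonattainment cells
# are assumed to be NaN.
df = pd.DataFrame({
    "fips": [72137] * 8,
    "year": [1992] * 4 + [1993] * 4,
    "pollutant": [
        "Sulfur Dioxide (1971)",
        "PM-2.5 (1997)",
        "8-Hour Ozone (2015)",
        "'Nitrogen Dioxide (1971)'",
    ] * 2,
    "nonattainment": [np.nan, "P", "W", np.nan, np.nan, np.nan, "W", np.nan],
})

pm_pollutants = ["PM-2.5 (1997)", "PM-2.5 (2006)", "PM-10 (1987)", "PM-2.5 (2012)"]

# Row-level condition: a PM pollutant whose nonattainment status is P or W.
# NaN never matches isin(["P", "W"]), so empty cells count as False.
is_pm_nonattainment = (
    df["pollutant"].isin(pm_pollutants) & df["nonattainment"].isin(["P", "W"])
)

# Broadcast to the whole fips-year group: True if any row in the group qualifies.
group_has_pm = is_pm_nonattainment.groupby([df["fips"], df["year"]]).transform("any")

# Store the flag as 1/0, matching the output shown above.
df["nonattainment_pm"] = group_has_pm.astype(int)

print(df)
```

If the 1993 rows should stay empty, as in the question's expected output, rather than 0, one option would be group_has_pm.astype(int).where(group_has_pm), which leaves NaN wherever the group has no qualifying row (the column then becomes float).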
eshirvana