0

I am trying to update certain rows in a column based on a set of conditions. For instance, in the example below I am trying to shorten some of the names in the "Country" column using a few if statements. Here is what I am using. Is there a better way to do this?

energy['Country'] = energy['Country'].apply(lambda x: 'South Korea' if x=='Republic of Korea' 
                        else('United States' if x=='United States of America20' 
                        else('United Kingdom' if x=='United Kingdom of Great Britain and Northern Ireland'
                        else('Hong Kong' if x=='China, Hong Kong Special Administrative Region' 
                             else x))))
ferhen
  • It's almost always better to use a built-in instead of iterating and using apply. Rather than apply, use [dataframe.replace](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html) with a dictionary of items to replace – G. Anderson Feb 11 '20 at 21:07
  • https://stackoverflow.com/questions/19226488/change-one-value-based-on-another-value-in-pandas, https://stackoverflow.com/questions/34962104/pandas-how-can-i-use-the-apply-function-for-a-single-column – AMC Feb 11 '20 at 21:50
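
The dictionary-based replace suggested in the first comment could look roughly like the sketch below. The three-row `energy` frame and the name `short_names` are made up for illustration, since the real data is not shown in the question, and the sketch calls `Series.replace` on the single column rather than `DataFrame.replace` on the whole frame.

```python
import pandas as pd

# Hypothetical sample data; the real `energy` DataFrame is not shown in the question.
energy = pd.DataFrame({'Country': ['Republic of Korea', 'France',
                                   'United States of America20']})

# Keys are the exact values to look for, values are their replacements.
short_names = {'Republic of Korea': 'South Korea',
               'United States of America20': 'United States'}

# Values that are not keys in the dictionary (here 'France') are left unchanged.
energy['Country'] = energy['Country'].replace(short_names)
print(energy['Country'].tolist())
```

Because unmatched values pass through untouched, no fallback step is needed with this approach.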

2 Answers

2

Using pd.Series.map

country_map = {'Republic of Korea': 'South Korea',
               'United States of America20': 'United States',
               'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
               'China, Hong Kong Special Administrative Region': 'Hong Kong'}

# map returns NaN for values missing from the dictionary, so fall back to the original names
energy['Country'] = energy['Country'].map(country_map).fillna(energy['Country'])
Yuna A.
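
One thing to watch with `Series.map` and a dictionary: any value that is not a key in the dictionary comes back as NaN. The sketch below illustrates this on a made-up two-element Series (the names `countries`, `mapped` and `restored` are hypothetical) and shows one possible way to keep the original value for unmapped entries.

```python
import pandas as pd

# Hypothetical sample; 'France' deliberately has no entry in the mapping.
countries = pd.Series(['Republic of Korea', 'France'])
country_map = {'Republic of Korea': 'South Korea'}

# Plain map: values without a matching key become NaN.
mapped = countries.map(country_map)

# Filling the gaps from the original Series restores the unmapped names.
restored = mapped.fillna(countries)
print(restored.tolist())
```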
0

As much as you can, avoid apply (DataFrame.apply or Series.apply), which is a hidden loop. Consider vectorized processing such as numpy.select, where you pass vectors (i.e., NumPy arrays or pandas Series) into a method rather than scalar elements one at a time:

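import numpy as np

# Conditions test for the long names; the matching choice is the short name.
# default keeps every other country unchanged (np.select would otherwise fill in 0).
energy['Country'] = np.select([energy['Country'] == 'Republic of Korea', 
                               energy['Country'] == 'United States of America20', 
                               energy['Country'] == 'United Kingdom of Great Britain and Northern Ireland', 
                               energy['Country'] == 'China, Hong Kong Special Administrative Region'],
                              ['South Korea', 
                               'United States', 
                               'United Kingdom',
                               'Hong Kong'],
                              default=energy['Country'])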
Parfait
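
For anyone who wants to try the numpy.select approach in isolation, here is a minimal self-contained sketch. The three-row `energy` frame is made-up sample data, and the names `conditions` and `choices` are only illustrative. It assumes that passing the original column as `default` is acceptable for keeping unmatched rows as they are.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'France' matches none of the conditions.
energy = pd.DataFrame({'Country': ['Republic of Korea', 'France',
                                   'China, Hong Kong Special Administrative Region']})

# One boolean Series per rule, evaluated over the whole column at once.
conditions = [energy['Country'] == 'Republic of Korea',
              energy['Country'] == 'China, Hong Kong Special Administrative Region']

# choices[i] is used wherever conditions[i] is True.
choices = ['South Korea', 'Hong Kong']

# Rows matching no condition take their value from `default`.
energy['Country'] = np.select(conditions, choices, default=energy['Country'])
print(energy['Country'].tolist())
```

The two lists must have the same length, which is why a missing comma between adjacent string literals (Python silently concatenates them) makes `np.select` raise an error.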