16

I have a dataframe and 2 lists.

The 1st list gives the index values in the dataframe that I want to replace.

The 2nd list gives the values I want to use instead, in matching order.

I don't want to touch any of the other index values.

Here is the dataframe:

df =  pd.DataFrame.from_dict({u'Afghanistan': 6532.0,
 u'Albania': 662.0,
 u'Andorra': 2.0,
 u'Angola': 2219.0,
 u'Antigua and Barbuda': 0.0,
 u'Argentina': 6.0,
 u'Armenia': 15.0,
 u'Australia': 108.0,
 u'Azerbaijan': 210.0,
 u'Bahamas': 0.0,
 u'Bahrain': 6.0,
 u'Bangladesh': 5098.0,
 u'Barbados': 0.0,
 u'Belarus': 21.0,
 u'Belize': 0.0,
 u'Benin': 4244.0,
 u'Bhutan': 418.0,
 u'Bolivia (Plurinational State of)': 122.0,
 u'Bosnia and Herzegovina': 43.0,
 u'Botswana': 2672.0,
 u'Brazil': 36.0,
 u'Brunei Darussalam': 42.0,
 u'Bulgaria': 46.0,
 u'Burkina Faso': 6074.0,
 u'Burundi': 18363.0,
 u'Cabo Verde': 2.0,
 u'Cambodia': 12237.0,
 u'Cameroon': 14629.0,
 u'Canada': 206.0,
 u'Central African Republic': 3207.0,
 u'Chad': 3546.0,
 u'Chile': 0.0,
 u'China': 71093.0,
 u'Colombia': 1.0,
 u'Congo': 1678.0,
 u'Cook Islands': 2.0,
 u'Costa Rica': 0.0,
 u'Croatia': 9.0,
 u'Cuba': 0.0,
 u'Cyprus': 0.0,
 u'Czechia': 9.0,
 u"C\xf4te d'Ivoire": 5729.0,
 u'Democratic Republic of the Congo': 8282.0,
 u'Denmark': 14.0,
 u'Djibouti': 183.0,
 u'Dominica': 0.0,
 u'Dominican Republic': 253.0,
 u'Ecuador': 0.0,
 u'Egypt': 2633.0,
 u'El Salvador': 0.0,
 u'Eritrea': 789.0,
 u'Estonia': 9.0,
 u'Ethiopia': 1660.0,
 u'France': 10000.0,
 u'Gabon': 15.0,
 u'Gambia': 336.0,
 u'Georgia': 50.0,
 u'Ghana': 23068.0,
 u'Greece': 56.0,
 u'Grenada': 0.0,
 u'Guatemala': 0.0,
 u'Guinea': 11294.0,
 u'Guyana': 0.0,
 u'Haiti': 992.0,
 u'Honduras': 0.0,
 u'Hungary': 1.0,
 u'Iceland': 0.0,
 u'India': 38835.0,
 u'Indonesia': 3344.0,
 u'Iran (Islamic Republic of)': 11874.0,
 u'Iraq': 726.0,
 u'Israel': 36.0,
 u'Italy': 1457.0,
 u'Jamaica': 0.0,
 u'Japan': 22497.0,
 u'Jordan': 32.0,
 u'Kazakhstan': 245.0,
 u'Kenya': 21002.0,
 u'Kiribati': 0.0,
 u'Kuwait': 6.0,
 u'Kyrgyzstan': 16.0,
 u"Lao People's Democratic Republic": 332.0,
 u'Latvia': 0.0,
 u'Lebanon': 5.0,
 u'Lesotho': 660.0,
 u'Liberia': 5977.0,
 u'Lithuania': 19.0,
 u'Luxembourg': 0.0,
 u'Madagascar': 35256.0,
 u'Malawi': 304.0,
 u'Malaysia': 6187.0,
 u'Maldives': 20.0,
 u'Mali': 1578.0,
 u'Malta': 2.0,
 u'Marshall Islands': 0.0,
 u'Mauritius': 0.0,
 u'Mexico': 30.0,
 u'Micronesia (Federated States of)': 0.0,
 u'Mongolia': 925.0,
 u'Morocco': 7368.0,
 u'Mozambique': 7375.0,
 u'Myanmar': 845.0,
 u'Namibia': 469.0,
 u'Nauru': 0.0,
 u'Nepal': 9397.0,
 u'Netherlands': 1019.0,
 u'New Zealand': 65.0,
 u'Nicaragua': 0.0,
 u'Niger': 21319.0,
 u'Nigeria': 212183.0,
 u'Niue': 0.0,
 u'Norway': 0.0,
 u'Oman': 15.0,
 u'Pakistan': 2064.0,
 u'Palau': 0.0,
 u'Panama': 0.0,
 u'Papua New Guinea': 7135.0,
 u'Paraguay': 0.0,
 u'Peru': 1.0,
 u'Philippines': 7120.0,
 u'Poland': 77.0,
 u'Portugal': 45.0,
 u'Qatar': 46.0,
 u'Republic of Korea': 32647.0,
 u'Republic of Moldova': 687.0,
 u'Romania': 35.0,
 u'Russian Federation': 4800.0,
 u'Rwanda': 2095.0,
 u'Saint Kitts and Nevis': 0.0,
 u'Saint Lucia': 0.0,
 u'Saint Vincent and the Grenadines': 0.0,
 u'San Marino': 1.0,
 u'Sao Tome and Principe': 0.0,
 u'Senegal': 5839.0,
 u'Serbia': 38.0,
 u'Sierra Leone': 3575.0,
 u'Singapore': 141.0,
 u'Slovakia': 0.0,
 u'Somalia': 3965.0,
 u'South Africa': 1459.0,
 u'Spain': 152.0,
 u'Sri Lanka': 16527.0,
 u'Sudan': 2875.0,
 u'Suriname': 0.0,
 u'Swaziland': 10.0,
 u'Sweden': 59.0,
 u'Syrian Arab Republic': 146.0,
 u'Tajikistan': 192.0,
 u'Thailand': 4074.0,
 u'The former Yugoslav republic of Macedonia': 36.0,
 u'Togo': 3578.0,
 u'Tonga': 0.0,
 u'Trinidad and Tobago': 0.0,
 u'Tunisia': 47.0,
 u'Turkey': 16244.0,
 u'Turkmenistan': 113.0,
 u'Uganda': 42554.0,
 u'Ukraine': 817.0,
 u'United Arab Emirates': 69.0,
 u'United Kingdom of Great Britain and Northern Ireland': 104.0,
 u'United Republic of Tanzania': 14649.0,
 u'United States of America': 85.0,
 u'Uruguay': 0.0,
 u'Uzbekistan': 80.0,
 u'Vanuatu': 9.0,
 u'Venezuela (Bolivarian Republic of)': 22.0,
 u'Viet Nam': 16512.0,
 u'Zambia': 30930.0,
 u'Zimbabwe': 1483.0}, orient = 'index')

Here is the 1st list:

list1 = [u'Bolivia (Plurinational State of)', u'Brunei Darussalam', u'Cabo Verde', u'China',
    u'Congo', u'Cook Islands', u'Czechia', u"C\xf4te d'Ivoire", 
    u"Democratic People's Republic of Korea", u'France', u'Iran (Islamic Republic of)', 
    u"Lao People's Democratic Republic", u'Micronesia (Federated States of)', u'Niue', 
    u'Republic of Korea', u'Republic of Moldova', u'Russian Federation', u'Sao Tome and Principe', 
    u'Serbia', u'Somalia', u'Syrian Arab Republic', u'The former Yugoslav republic of Macedonia', 
    u'United Kingdom of Great Britain and Northern Ireland', u'United Republic of Tanzania', 
    u'United States of America', u'Venezuela (Bolivarian Republic of)', u'Viet Nam']

Here is the 2nd list:

list2 = [u'Bolivia', u'Brunei', u'Cape Verde', u'China[1]', u'Democratic Republic of the Congo', 
    u'Cook Islands (NZ)', u'Czech Republic', u'Ivory Coast', u'North Korea', u'France[2]', 
    u'Iran', u'Laos', u'Federated States of Micronesia', u'Niue (NZ)', u'South Korea', 
    u'Moldova[3]', u'Russia', u'S\xe3o Tom\xe9 and Pr\xedncipe', u'Serbia[5]', 
    u'Somalia[6]', u'Syria', u'Macedonia', u'United Kingdom', u'Tanzania', 
    u'United States', u'Venezuela', u'Vietnam']

This is clearly the sort of thing Python excels at, and I suspect a simple for loop will do it, but I can't quite wrap my head around the logic (yet).

Any help gratefully appreciated!

James
kiltannen
  • Not sure what has to be replaced where? – cs95 Apr 21 '18 at 02:22
  • You could try to use the replace function in pandas. https://stackoverflow.com/questions/27060098/replacing-few-values-in-a-pandas-dataframe-column-with-another-value – inspired_learner Apr 21 '18 at 02:25
  • In the dataframe, some of the index values are not what I want. The 1st list identifies which index values I want to change, the second list identifies the values I want to change them to. Same number of items in each list - and their positions match. – kiltannen Apr 21 '18 at 02:32

3 Answers

24

Use rename with a dictionary that maps each old index label to its new one:

df = df.rename(index=dict(zip(list1,list2)))
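
As a minimal sketch of how this one-liner behaves, here is the same call on a three-row stand-in for the question's dataframe. The column name "value" and the variable names old_labels and new_labels are assumptions made for illustration; the labels and numbers are copied from the question.

```python
import pandas as pd

# Small stand-in for the question's dataframe (hypothetical column name "value").
df = pd.DataFrame(
    {"value": [42.0, 3546.0, 16512.0]},
    index=["Brunei Darussalam", "Chad", "Viet Nam"],
)

old_labels = ["Brunei Darussalam", "Viet Nam"]  # plays the role of list1
new_labels = ["Brunei", "Vietnam"]              # plays the role of list2

# zip pairs the lists by position; dict turns the pairs into an old -> new mapping.
mapping = dict(zip(old_labels, new_labels))

# rename returns a new dataframe; labels absent from the mapping ("Chad") are kept.
renamed = df.rename(index=mapping)

print(renamed.index.tolist())
```

Only the labels present in the mapping change, and the original df is left untouched unless the result is assigned back to it.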
Scott Boston
  • Just brilliant! Exactly what I wanted to do - and no looping involved at all! So easy when you know how! THANK YOU – kiltannen Apr 21 '18 at 02:37
8

Zip the two lists to create a dictionary that maps the old names to the new names.

Then call pandas.DataFrame.rename with the replacements dictionary, leaving all other arguments at their defaults (by default the mapping is applied to the index).

replacements = {l1:l2 for l1, l2 in zip(list1, list2)}

df2 = df.rename(replacements)
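
Two edge cases in the question's lists are worth checking, sketched below under some assumptions. First, "Democratic People's Republic of Korea" appears in list1 but not in the dataframe; by default rename silently skips such keys, and recent pandas versions accept errors="raise" to surface them. Second, list2 maps "Congo" to "Democratic Republic of the Congo", a label that already exists, so the renamed index is no longer unique. The two-row frame and the column name "value" are stand-ins for illustration.

```python
import pandas as pd

# Two rows copied from the question (hypothetical column name "value").
df = pd.DataFrame(
    {"value": [1678.0, 8282.0]},
    index=["Congo", "Democratic Republic of the Congo"],
)

replacements = {
    # Pairing taken from the question's lists; the target label already exists.
    "Congo": "Democratic Republic of the Congo",
    # This key is not in the index, so the default rename ignores it.
    "Democratic People's Republic of Korea": "North Korea",
}

df2 = df.rename(index=replacements)

# The rename succeeded, but it produced a duplicated index label.
has_duplicates = not df2.index.is_unique

# errors="raise" makes rename fail when a key is missing from the index.
try:
    df.rename(index=replacements, errors="raise")
    missing_key_detected = False
except KeyError:
    missing_key_detected = True

print(has_duplicates, missing_key_detected)
```

Whether the duplicate label is a problem depends on what the index is used for later; checking index.is_unique after renaming is a cheap way to find out.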
Haleemur Ali
0

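I believe there's another way now: pandas.DataFrame.set_index(). Note that it replaces the whole index rather than selected labels, so you must pass one label per row, and a plain list of strings is interpreted as column names.

Usage:

# new_labels is a placeholder for a list with one label per row of df
df = df.set_index(pd.Index(new_labels))

OR

# Use this if you want to assign one of the existing DataFrame columns as the index
df = df.set_index(df_column_id)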
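
A sketch of how set_index could be applied to the question's problem, assuming the goal is still to change only some labels. Because set_index swaps out the entire index, the example first builds a full-length label list by looking each existing label up in the old-to-new mapping and falling back to the label itself. The names old_labels, new_labels, full_labels and the column name "value" are hypothetical.

```python
import pandas as pd

# Small stand-in for the question's dataframe (hypothetical column name "value").
df = pd.DataFrame(
    {"value": [42.0, 3546.0, 16512.0]},
    index=["Brunei Darussalam", "Chad", "Viet Nam"],
)

old_labels = ["Brunei Darussalam", "Viet Nam"]
new_labels = ["Brunei", "Vietnam"]
mapping = dict(zip(old_labels, new_labels))

# One label per row: the replacement if there is one, otherwise the original.
full_labels = [mapping.get(label, label) for label in df.index]

# Wrapping in pd.Index makes set_index treat the list as labels, not column names.
result = df.set_index(pd.Index(full_labels))

print(result.index.tolist())
```

This ends up equivalent to rename with a dictionary, just with more steps, so it is mainly useful when a complete list of new labels is already at hand.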
Ariel Lubonja