0

I have a dataframe that looks like this:

destination_zip destination_state
502111387        IA
388588179        MS           
T2A2L9           AB                  
891              AUK     
774653028        TX   

I am trying to write a code that will be adding a new column as destination_country to my dataframe, something like this:

destination_zip destination_state  destination_country
502111387        IA                 US
388588179        MS                 US 
T2A2L9           AB                 CA
891              AUK                NZ
774653028        TX                 US

what I have tried so far is:

df.loc[df['destination_state']=='TX', df['destination_country']]= 'US'
df.loc[df['destination_state']=='IA', df['destination_country']]= 'US'
df.loc[df['destination_state']=='MS', df['destination_country']]= 'US'
df.loc[df['destination_state']=='AUK', df['destination_country']]= 'NZ'
df.loc[df['destination_state']=='AB', df['destination_country']]= 'CA'

but this is not way too long to work with, I wanted something that would be based on multiple conditions in a single line of code, something like this:

df.loc[df['destination_state']=='TX','IA','MS' , df['destination_country']]= 'US'

but this code is not working, can anyone help me with this? My dataframe has 7k rows, that's why I wanted something with multiple conditons. I am using juypter notebook, python-3

Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
Lily
  • 1
  • 1
  • 1
    I would say maybe [Remap values in pandas column with a dict](https://stackoverflow.com/q/20250771/15497888) like `df['destination_country'] = df['destination_state'].map({'TX': 'US', 'IA': 'US', 'MS': 'US', 'AUK': 'NZ', 'AB': 'CA'})` (the dictionary could be programmatically built) or [isin](https://pandas.pydata.org/docs/reference/api/pandas.Series.isin.html) on the condition `df.loc[df['destination_state'].isin(['TX','IA','MS']) , 'destination_country']= 'US'` but the multiple conditions lookups will be slower than map. – Henry Ecker Oct 17 '21 at 17:26
  • 1
    IMO the dictionary is the best approach as you're supposed to have key -> value relationships. Using conditionals would just reimplement a custom search and should be less efficient than the hashed keys of a dictionary. The only case where a conditional might be preferred is if all states are in the same country except one. – mozway Oct 17 '21 at 17:31
  • @mozway can you explain this in detail? – Lily Oct 17 '21 at 17:37
  • [link](https://stackoverflow.com/questions/50355778/is-there-a-python-library-that-would-return-a-city-when-provided-with-zip-code) this is much similar question to mine, but this would return a state from the zip code, is there any way to extract countries using this library? – Lily Oct 17 '21 at 17:49
  • @Lily I provided an example (for the state -> country use case) – mozway Oct 17 '21 at 17:52

1 Answers1

0

Here is how I would proceed:

# this format is easy to maintain
countries = {'US': ['IA', 'MS', 'TX'],
             'CA': ['AB'],
             'NZ': ['AUK'],
             }

# transform it to the inverse
states = {v:k for k,vals in countries.items() for v in vals}

# map states -> country
df['destination_country'] = df['destination_state'].map(states)

output:

  destination_zip destination_state destination_country
0       502111387                IA                  US
1       388588179                MS                  US
2          T2A2L9                AB                  CA
3             891               AUK                  NZ
4       774653028                TX                  US
mozway
  • 194,879
  • 13
  • 39
  • 75
  • thanks, but [example](https://stackoverflow.com/questions/50355778/is-there-a-python-library-that-would-return-a-city-when-provided-with-zip-code) this is what I wanted to ask, as I say there are 7k entries, and I wanted something very effective, your answer is an easy way, but can you guide me with this? – Lily Oct 17 '21 at 18:05
  • 1
    `map` is one of the most efficient replacement operations that exist in pandas. – Henry Ecker Oct 17 '21 at 18:12
  • Well the number of entries is not an issue. You just need to build you dictionary programmatically. Can you give an example of the way the key/values are stored? Or are you looking for a magic method to convert `AB` to `CA` without any other info? – mozway Oct 17 '21 at 18:16