I want to populate a column of a fixed length in a pandas DataFrame with values from a dictionary, using a list comprehension.

The dictionary's keys cover most, but not all, of the unique values in an existing column of the data frame. Where a value has no key in the dictionary, I would like to insert the pandas-native null value, NaN. That way the new list has the same length as the column in the data frame.

I have tried using an if-else structure in the list comprehension, as follows:

df['col_B'] = [d[key] for key in df['col_A'].values if key in d else NaN]

I expect to get a fully populated column with NaN for rows where there was no key-value pair in the dictionary. But I get the following error:

SyntaxError: invalid syntax

I am aware that the error lies in the else part of the statement, but I do not know how to specify that part so that it inserts NaN for the missing key-value pairs.

Here is a toy example that reproduces the error:

# Import pandas library 
import pandas as pd

# create a dictionary
d = {1:'a',2:'b', 3:'c'}  

# create a list
data = [2,1,3,1,4,2,2,1,4,3]

# Create a data frame with the list as its only column
df = pd.DataFrame(data, columns = ['number']) 

# add new column by populating list with matching dictionary values
df['letter'] = [d[key] for key in df['number'] if key in d else NaN]
Des Grieux

1 Answer

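You just need map here. Series.map looks up each value of the column in the dictionary and returns NaN for any value that has no matching key, so the result has the same length as the column.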

df['letter'] = df['number'].map(d)
df
   number letter
0       2      b
1       1      a
2       3      c
3       1      a
4       4    NaN
5       2      b
6       2      b
7       1      a
8       4    NaN
9       3      c
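
If a list comprehension is still preferred, the SyntaxError in the question most likely comes from where the if/else sits: a trailing `if` in a comprehension only filters items and cannot take an `else`, whereas a conditional expression placed before `for` picks a value for every item. The bare name NaN is also not defined in Python, so the sketch below assumes NumPy is available and uses `np.nan`. The column name `letter_get` is only illustrative.

```python
import numpy as np
import pandas as pd

d = {1: 'a', 2: 'b', 3: 'c'}
df = pd.DataFrame([2, 1, 3, 1, 4, 2, 2, 1, 4, 3], columns=['number'])

# Conditional expression goes before `for`, so every row yields a value
df['letter'] = [d[key] if key in d else np.nan for key in df['number']]

# Same idea with dict.get, which returns the default when the key is missing
df['letter_get'] = [d.get(key, np.nan) for key in df['number']]
```

Both columns should match the output of map above, with NaN in the rows where number is 4.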
BENY