I am trying to map values in a pandas DataFrame to value pairs stored in a dictionary. Each pair is a list, and I want to split its items into their own columns. Each list always contains exactly two items.

Stand-alone example

import pandas as pd

mydf = pd.DataFrame({"x": [1, 2, 3, 4], "cat": ["text1", "text3", "text1", "text4"]})

mydict = {
    "text1": ["a15", "b3"],
    "text2": ["a6", "b1"],
    "text3": ["a3", "b3"],
    "text4": ["a1", "b12"],
}

display(mydf)

   x    cat
0  1  text1
1  2  text3
2  3  text1
3  4  text4

Desired result

   x    cat left right
0  1  text1  a15    b3
1  2  text3   a3    b3
2  3  text1  a15    b3
3  4  text4   a1   b12

Efforts

Based on a similar question, I tried the following:

new_map = {str(x): str(k) for k, v in mydict.items() for x in v}
mydf['left'] = mydf['cat']
mydf['left'].map(new_map)

This doesn't work; it returns this:

[image: output of the map call, not preserved]

I also tried the following (based on this):

new_map = {str(x): str(k) for k, v in mydict.items() for x in v}
mydf['left'] = mydf.index.map(lambda x: new_map[x])

This raised the error KeyError: 0

a11

4 Answers

This can be done with merge:

  • convert the dict to a list of rows and build a DataFrame from it
  • merge that DataFrame with mydf on the cat key

mylist = [[key, mydict[key][0], mydict[key][1]] for key in mydict]
df2 = pd.DataFrame(mylist, columns=['cat', 'left', 'right'])
df2

df = pd.merge(mydf, df2, on='cat', how='left')
df

   x    cat left right
0  1  text1  a15    b3
1  2  text3   a3    b3
2  3  text1  a15    b3
3  4  text4   a1   b12
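
For reference, here is a self-contained sketch of the same merge, with the question's data repeated so it runs on its own. The name lookup and the validate argument are additions for illustration, not part of the answer as posted; validate="many_to_one" asks pandas to raise an error if the lookup table ever contains a duplicate cat key.

```python
import pandas as pd

mydf = pd.DataFrame({"x": [1, 2, 3, 4], "cat": ["text1", "text3", "text1", "text4"]})
mydict = {
    "text1": ["a15", "b3"],
    "text2": ["a6", "b1"],
    "text3": ["a3", "b3"],
    "text4": ["a1", "b12"],
}

# One row per dictionary key: [key, first item, second item]
lookup = pd.DataFrame(
    [[key, pair[0], pair[1]] for key, pair in mydict.items()],
    columns=["cat", "left", "right"],
)

# how="left" keeps every row of mydf, in its original order.
# validate="many_to_one" is an extra safety check (an assumption added here):
# it raises MergeError if "cat" is not unique in the lookup table.
result = pd.merge(mydf, lookup, on="cat", how="left", validate="many_to_one")
print(result)
```

The unused key text2 simply never matches a row in mydf, so it does not appear in the result.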
nay

Create a separate DataFrame from the mapping, then attach it to the original DataFrame with join or concat:

    mapped = (mydf.cat
                  .map(mydict)
                  .apply(pd.Series)
                  .set_axis(['left', 'right'], axis='columns'))
    
      left right
    0  a15    b3
    1   a3    b3
    2  a15    b3
    3   a1   b12

Join back to the main df

 mydf.join(mapped)

   x    cat left right
0  1  text1  a15    b3
1  2  text3   a3    b3
2  3  text1  a15    b3
3  4  text4   a1   b12

You can skip the verbose option above with this solution from @HenryEcker:

mydf[['left', 'right']] = mydf.cat.map(mydict).apply(pd.Series)
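
A note on speed: apply(pd.Series) builds one Series per row, which tends to be slow on large frames. A possible alternative, sketched below, is to build the new columns from a plain list of pairs. This is only a sketch and it assumes every value in cat is a key of mydict; with a missing key, map would produce NaN instead of a list and the DataFrame construction would not behave as shown. The name pairs is made up for illustration.

```python
import pandas as pd

mydf = pd.DataFrame({"x": [1, 2, 3, 4], "cat": ["text1", "text3", "text1", "text4"]})
mydict = {
    "text1": ["a15", "b3"],
    "text2": ["a6", "b1"],
    "text3": ["a3", "b3"],
    "text4": ["a1", "b12"],
}

# map gives a Series of two-item lists; tolist() turns it into a list of lists
pairs = mydf["cat"].map(mydict).tolist()

# Reuse mydf's index so the new columns line up row by row
mydf[["left", "right"]] = pd.DataFrame(
    pairs, index=mydf.index, columns=["left", "right"]
)
print(mydf)
```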
sammywemmy

You can use merge. Take advantage of the fact that not much is needed to transform mydict into a dataframe:

mydf.merge(pd.DataFrame(mydict,
                        index=['left', 'right']).T,
           left_on='cat',
           right_index=True,
           how='left', sort=False
          )

output:

   x    cat left right
0  1  text1  a15    b3
1  2  text3   a3    b3
2  3  text1  a15    b3
3  4  text4   a1   b12
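
To see why this works, it may help to look at the transposed frame on its own. In the sketch below, the names wide and lookup are only illustrative.

```python
import pandas as pd

mydict = {
    "text1": ["a15", "b3"],
    "text2": ["a6", "b1"],
    "text3": ["a3", "b3"],
    "text4": ["a1", "b12"],
}

# A dict of lists makes each key a column; index labels the two rows
wide = pd.DataFrame(mydict, index=["left", "right"])

# Transposing gives one row per key, indexed by the key,
# which is why the merge can use right_index=True
lookup = wide.T
print(lookup)
```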
mozway

We can use pd.DataFrame.from_dict to convert mydict directly to a DataFrame, then join it to mydf:

mydf = mydf.join(
    pd.DataFrame.from_dict(mydict, orient='index', columns=['left', 'right']),
    on='cat'
)

mydf:

   x    cat left right
0  1  text1  a15    b3
1  2  text3   a3    b3
2  3  text1  a15    b3
3  4  text4   a1   b12
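
One thing worth knowing is what happens when cat contains a value that is not in mydict. Since join defaults to a left join, the row is kept and the new columns are filled with NaN. A small sketch, where "text9" is a made-up category and df, lookup and out are illustrative names:

```python
import pandas as pd

mydict = {
    "text1": ["a15", "b3"],
    "text2": ["a6", "b1"],
    "text3": ["a3", "b3"],
    "text4": ["a1", "b12"],
}

# "text9" is a hypothetical category with no entry in mydict
df = pd.DataFrame({"x": [1, 2], "cat": ["text1", "text9"]})

# One row per key, indexed by the key
lookup = pd.DataFrame.from_dict(mydict, orient="index", columns=["left", "right"])

# on="cat" matches df["cat"] against lookup's index (left join by default)
out = df.join(lookup, on="cat")
print(out)
```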
Henry Ecker