Both your definitions raise some kind of error [df
because all columns need to have the same length and d
just had 1 item, and dtd
because ['sv1': 500]
is not valid syntax]. However, if you had
# import pandas as pd
# import numpy as np
df = pd.DataFrame({ 'id': [0, 1, 2], 'nm': ['pn1','pn2','pn3'],
'v': [np.nan, 25, 0],
'd': [{'k1':'v1'},{'k2':'v2'},{'k3':'v3'}] })
dtd = { 'pn1':{'s':100,'v':20, 'sv':{'sv1': 500}},
'pn2':{'s':150,'v':30, 'sv':{'sv1': 400}} }
then you could use something like the function below for extracting nested values [see this question if you're interested in other variations of it]
def getVal(obj, *keys, vDef=None):
try:
for k in keys: obj = obj[k]
except: obj = vDef
return obj
If you want to use .map
or .apply
to create the new column,
sv1 = df['nm'].map(lambda k: getVal(dtd, k, 'sv', 'sv1', vDef=np.nan))
# sv1 = df['nm'].apply(lambda k: getVal(dtd, k, 'sv', 'sv1', vDef=np.nan))
df['v'] = sv1.combine_first(df['v'])
or you can use .combine
directly
df['v'] = df['nm'].combine(df['v'], lambda k,vd: getVal(dtd,k,'sv','sv1',vDef=vd))
or you can also use list comprehension (I find it more readable, but it's mostly just a personal preference)
df['v'] = [getVal(dtd,k,'sv','sv1',vDef=vd) for k,vd in zip(df['nm'],df['v'])]
whichever of the 3 methods is used, the resulting df
should look like
╒════╤══════╤══════╤═════╤══════════════╕
│ │ id │ nm │ v │ d │
╞════╪══════╪══════╪═════╪══════════════╡
│ 0 │ 0 │ pn1 │ 500 │ {'k1': 'v1'} │
├────┼──────┼──────┼─────┼──────────────┤
│ 1 │ 1 │ pn2 │ 400 │ {'k2': 'v2'} │
├────┼──────┼──────┼─────┼──────────────┤
│ 2 │ 2 │ pn3 │ 0 │ {'k3': 'v3'} │
╘════╧══════╧══════╧═════╧══════════════╛
[ printed with print(df.to_markdown(tablefmt='fancy_grid'))
]