Is there a better way to fill missing values between two dataframes?

Question

I'm trying to fill missing values in one dataframe based on the data in other dataframes. My first dataframe looks like this

[df1]:

Key Value
1   15
2   30
23  NaN
1   15
4   60
2   30
5   NaN
2   NaN
4   NaN
25  NaN
62  NaN
32  NaN
21  315
21  NaN

And the dataframe I want to extract the information from looks like this one:

[df2]:
Key   Value
23        "fine"
5         "five"
2         30
4         60
25        "two"
62        "bar"
32        "foo"
21        315

So my output would look something like this:

   Key     Value
    2   30.0
   23  "fine"
    1   15.0
    4   60.0
    2   30.0
    5   "five"
    2   30.0
    4   60.0
   25  "two"
   62  "bar"
   32  "foo"
   21  315.0
   21  315.0

I've managed to do this iterating over the rows of both dataframes, using

for i, value in df1.iterrows():
    for j, value2 in df2.iterrows():
        if df1.loc[i,"Key"]==df2.loc[j,"Key"]:
            df1.loc[i,"Value"]=df2.loc[j,"Value"]
        else:
            pass

but using this with my actual information (Dataframes of around 100000 rows and 20+ columns each) takes too long and I believe there's a better way to do it with a pandas built-in function. Thanks in advance!

Does this answer your question? [Pandas fill missing values in dataframe from another dataframe](https://stackoverflow.com/questions/29357379/pandas-fill-missing-values-in-dataframe-from-another-dataframe) — hilberts_drinking_problem, Jan 20 '22 at 20:43
df1 = df1.fillna(df2) does not do the trick since not all values in df2 are present in df1 and even though it fills the NaN values, those values do not correspond to the keys — Juan Sebastián Rubiano Jara, Jan 20 '22 at 21:00
One of the answers mentions `df.update` which might work in your case. — hilberts_drinking_problem, Jan 20 '22 at 21:04
I just tried it but since '''df.update''' aligns on idices it does not work for dataframes of different dimensions — Juan Sebastián Rubiano Jara, Jan 20 '22 at 21:13

score 1 · Answer 1 · answered Jan 20 '22 at 20:47

1

You can achieve that with only one line of code:

df1 = df1.fillna(df2)

answered Jan 20 '22 at 20:47

peru_45

330
3
16

score 0 · Answer 2 · answered Jan 21 '22 at 03:26

0

try this:

f = df1.query('Value.isnull()').ffill(axis=1).Value.map(dict(df2.values))
df1.loc[f.index, 'Value'] = f
print(df1)

answered Jan 21 '22 at 03:26

ziying35

1,190
3
6

Is there a better way to fill missing values between two dataframes?

2 Answers2