
I have two dataframes.

df1 contains many NaN values, which I want to replace with values from df2. The number of values in df2 is the same as the number of NaN values in df1.

I have tried join, merge, and writing a loop, but without success.

Thanks in advance!

pd.DataFrame 1
0          NaN
1        240.0
2        229.0
3       1084.0
4       2078.0
        ....
Name: Healthcare_1, Length: 9999, dtype: float64

pd.DataFrame 2
0        830.0
6        100.0
7        100.0
8        830.0
9       1046.0
         ...  
Name: Healthcare_1, Length: 4797, dtype: float64
NIk

2 Answers


This answer assumes that the rows containing NaNs in df1 have the same index labels as the rows in df2 that should replace them.

Load the following modules:

import pandas as pd
import numpy as np

We have two example DataFrames:

df1 = pd.DataFrame({'c1': [np.nan, 240, np.nan, 1084, 2078]})
df2 = pd.DataFrame({'c1': [830, 100, 100, 830, 1046]}, index=[0,2,7,8,9])

Determine the index labels where NaNs occur in df1:

# Index labels (not positions) of the rows where c1 is NaN
ind = df1.index[df1['c1'].isnull()]

Check where these indices occur in df2. This should give array([ True, True, False, False, False]):

df2.index.isin(list(ind))

Replace the values from df1 with the values from df2 at the index ind:

df1[df1.index.isin(ind)] = df2[df2.index.isin(ind)]
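
The steps above can be combined into one runnable sketch. It reuses the example frames from this answer and relies on the same assumption, namely that the NaN rows in df1 share index labels with rows in df2. The names nan_labels and common are illustrative, and the .loc assignment is an alternative to the boolean-mask assignment, not the answer's exact code.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'c1': [np.nan, 240, np.nan, 1084, 2078]})
df2 = pd.DataFrame({'c1': [830, 100, 100, 830, 1046]}, index=[0, 2, 7, 8, 9])

# Index labels of the rows in df1 that hold NaN
nan_labels = df1.index[df1['c1'].isnull()]

# Keep only the labels that also exist in df2
# (assumption: in the real data this is all of them)
common = nan_labels.intersection(df2.index)

# Label-aligned assignment: only the NaN rows of df1 are touched
df1.loc[common, 'c1'] = df2.loc[common, 'c1']

print(df1)
```

Rows of df2 whose labels have no NaN counterpart in df1 (7, 8 and 9 here) are simply ignored.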


Ruthger Righart

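Solution 1: Use .update() to replace the NaN values in df1 with the corresponding values in df2. It modifies df1 in place and overwrites every entry whose index label also appears in df2 with a non-NaN value, not only the NaN entries: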

df1 = pd.Series([np.nan, 240, 229, 1084, 2078])
df2 = pd.Series([830, 100, 100, 830, 1046], index=[0, 6, 7, 8, 9])

df1.update(df2)
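
If df2 might also carry labels for rows of df1 that already have a value, one option is to restrict it to the NaN labels before calling .update(). The following is a minimal sketch under that assumption; s1, s2, s3 and nan_labels are illustrative names, not part of the original data.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([np.nan, 240, 229, 1084, 2078])
s2 = pd.Series([830, 100, 100, 830, 1046], index=[0, 6, 7, 8, 9])

# Restrict s2 to the labels where s1 is NaN, so nothing else is overwritten
nan_labels = s1.index[s1.isna()]
s1.update(s2[s2.index.isin(nan_labels)])  # in place, returns None

print(s1)

# Why the restriction matters: update() also replaces existing values
s3 = pd.Series([1.0, 2.0])
s3.update(pd.Series([9.0], index=[1]))

print(s3)
```

In the second part, the entry at label 1 is replaced even though it was not NaN.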

Solution 2: You can also use .combine_first() to fill the NaN values of the first Series with the values of the second. The result contains the union of both indexes, so select df1's labels with .loc:

df1.combine_first(df2).loc[df1.index]

Resulting dataframe:

    0
0   830.0
1   240.0
2   229.0
3   1084.0
4   2078.0
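
As a self-contained sketch of Solution 2, the snippet below rebuilds the example Series and compares .combine_first() with .fillna(), which also aligns on index labels when given a Series. The variable names are illustrative, and .fillna() is an alternative added here for comparison, not part of the original answer.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([np.nan, 240, 229, 1084, 2078])
s2 = pd.Series([830, 100, 100, 830, 1046], index=[0, 6, 7, 8, 9])

# combine_first returns the union of both indexes; keep only s1's labels
combined = s1.combine_first(s2).loc[s1.index]

# fillna with a Series aligns on index labels and keeps s1's index unchanged
filled = s1.fillna(s2)

print(combined)
print(filled)
```

Both calls return a new Series and leave s1 untouched, unlike .update().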
Sander van den Oord
  • Thanks. Used data.loc[data['Healthcare_1'].index.isin(y_pred_HC1_NaN.index), ['Healthcare_1']] = y_pred_HC1_NaN['Healthcare_1'] – NIk Nov 13 '20 at 18:27