Replace NaN values in dataframe column with values from another data frame when column values match

Question

I have a dataframe "df" that has many thousands of rows and NaN values in "Waist":

Dataframe "df"

Height  Weight  Bust  Cup     Waist   Sales    Returns
64      130     34    4       NaN     10       1
65      130     34    3       27      11       4
66      130     34    3       26      9        5
65      130     34    3       NaN     3        2
64      130     34    3       28      5        1
64      130     34    3       NaN     6        1
67      129     33    4       26      7        4
68      130     33    3       27      6        3
66      130     33    3       27      4        3

I have a second dataframe "dfwaist" which is smaller in both number of rows and columns. 'dfwaist' only contains Height, Weight, Bust, Cup, Waist, and Count.

I'd like to return the value of "Waist" from 'dfwaist' where the values for Height, Weight, Bust, and Cup match 'df' and replace the NaN values in 'df' with the returned values. If there is no match, leave the NaN in place.

Dataframe 'dfwaist'

Height  Weight  Bust  Cup     Waist   Count
64      130     34    4       27      79    
65      130     34    3       27      76     
66      130     34    3       26      72        
65      130     34    3       26      63        
64      130     34    3       28      55       
64      130     34    3       28      46        
67      129     33    4       26      47        
68      130     33    3       27      36        
66      130     33    3       27      24

Desired output for 'df'

Height  Weight  Bust  Cup     Waist   Sales    Returns
64      130     34    4       27      10       1
65      130     34    3       27      11       4
66      130     34    3       26      9        5
65      130     34    3       26      3        2
64      130     34    3       28      5        1
64      130     34    3       28      6        1
67      129     33    4       26      7        4
68      130     33    3       27      6        3
66      130     33    3       27      4        3

Is there a way to return and overwrite NaN values in my first data frame with those found in the second conditionally?

I've already tried .combine_first() and .merge() and both fail because both assume the same shape between the data frames as is suggested in Pandas fill missing values in dataframe from another dataframe

.fillna() also assumes the same shape between the dataframes as is suggested in proper way to replace NaN value from another dataframe on column match in pandas It also requires a common index to be set, but a common index doesn't exist between the two dataframes.

.combine_first() is promising, but it would fill at most one of the rows in "df" and wouldn't respect the combination of Height, Weight, Bust, and Cup required before combining "df" and "dfwaist". This was suggested in how can I replace NaN value with data from another dataframe in python?

replacing values in a pandas dataframe with values from another dataframe based common columns will not preserve the order of the data, and requires the creation of a third dataframe.

Lastly, I've tried disregarding the second data frame entirely and filling NaN values with the .mode() of the Waist column when the combination of Height, Weight, Bust, and Cup is grouped in a .groupby(). The first V1 fails because of a key error: 0. The second fails to run. The third runs, but imputes NaN values over previously "good" values.

# For every Height, Weight, Bust, Cup combination, populate the 'Waist' feature with the mode of the combination.
# V1
#df['Waist'] = df.groupby(['Height', 'Weight', 'Bust', 'Cup','Waist']).Waist.apply(lambda x: x.fillna(x.mode()[]))
# V2
#df['Waist'] = df.groupby(['Height', 'Weight', 'Bust', 'Cup','Waist'])['Waist'].apply(lambda x: x.fillna(x.mode([0])))
# V3
#df['Waist'] = df.groupby(['Height','Weight','Bust','Cup','Waist']).agg(pd.Series.mode).apply(lambda x: x.fillna(x.mode([4])))

Replace NaN values in dataframe column with values from another data frame when column values match

0 Answers0