Replace Nan with group'd value

Question

I have the following dataframe:

df

 Index key1 | key2 | key3 | value1 | Value2   
 0    1     | 3    | 4    | 6      |  4 
 1    1     | 3    | 4    | Nan    |  3 
 2    1     | 2    | 3    | 8      |  6
 3    1     | 2    | 3    | Nan    |  5
 4    5     | 7    | 1    | Nan    |  2

For the value with the same keys (key1, key2, key3), I want to use numeric value and whenever there is no numeric value I want to remove the row. For value2, I simply want the sum.

Desired df

Index key1 | key2 | key3 | value1 | value2    
 0    1    | 3    | 4    | 6      | 7
 2    1    | 2    | 3    | 8      | 11

Retaining the correct index is not important.

The logic here is that there are two groups of (key1, key2, key3) that have non-NaN values in value1. There is a third group where key1, key2, key3 = 5, 7, 1, which I want to drop because there is no non-NaN value for value1 in this group.

Within each group, replace value1 (if it is NaN) with a non-NaN value1 value from the group. Finally, compute value2 to be the sum of all value2 values in the group.

`and whenever there is no numeric value I want to remove the row` - it means need remove NaNs rows by `value` column ? — jezrael, Mar 09 '21 at 06:33
The title is misleading. OP wants to group the df by a set of columns (namely, the `key*` columns). Within each group, if `value1` is NaN, use the `value1` value from another row within the same group (what happens if there are conflicting values? OP doesn't specify). Then, set `value2` to be the sum of all `value2` values within that group — inspectorG4dget, Mar 09 '21 at 16:21

Replace Nan with group'd value

0 Answers0

Linked