I have a dataframe with missing values within each row. In Pandas I fill them with
df.ffill(axis=1, inplace=True)
which forward-fills along each row, so every missing value takes the last non-missing value to its left.
What is the equivalent way to do this in PySpark? I have read about Window functions, but those operate across rows (down a column), not across the columns of a single row.
Example:
Input:
id | value1 | value2 | value3 | value4 | value5 |
---|---|---|---|---|---|
A | 2 | 3 | NaN | NaN | 6 |
B | 1 | NaN | NaN | NaN | NaN |
Expected output:
id | value1 | value2 | value3 | value4 | value5 |
---|---|---|---|---|---|
A | 2 | 3 | 3 | 3 | 6 |
B | 1 | 1 | 1 | 1 | 1 |
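
One possible direction, offered only as a sketch and not as a confirmed equivalent of `ffill(axis=1)`: since the fill runs across columns of the same row, each output column could be built with `coalesce` over the column itself followed by the columns to its left, nearest first. The helper names `ffill_sources` and `ffill_across_columns` are hypothetical, and the sketch assumes the missing values are Spark nulls. `coalesce` only skips nulls, so floating-point `NaN` values would have to be converted to null beforehand.

```python
from typing import Dict, List


def ffill_sources(columns: List[str]) -> Dict[str, List[str]]:
    """For each column, list the columns to try, nearest first.

    The column itself comes first, followed by each column to its left.
    """
    return {
        col: list(reversed(columns[: i + 1]))
        for i, col in enumerate(columns)
    }


def ffill_across_columns(df, columns: List[str]):
    """Row-wise forward fill on a PySpark DataFrame (assumes nulls, not NaN)."""
    # Imported here so the pure helper above can be used without Spark.
    from pyspark.sql import functions as F

    sources = ffill_sources(columns)
    # Columns outside the fill list (such as id) are passed through unchanged.
    other_cols = [c for c in df.columns if c not in columns]
    # Every expression reads the original columns, so the result does not
    # depend on the order in which the filled columns are built.
    filled = [
        F.coalesce(*[F.col(s) for s in sources[c]]).alias(c)
        for c in columns
    ]
    return df.select(*other_cols, *filled)
```

With the input above loaded as a Spark DataFrame `df`, calling `ffill_across_columns(df, ["value1", "value2", "value3", "value4", "value5"])` should produce the expected output: for row A, `value3` and `value4` fall back to `value2`, and for row B every column falls back to `value1`. Columns not listed are moved to the front of the result.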