In my "data" dataframe, I have 2 columns, 'time_stamp' and 'hour'. I want to insert 'hour' column values where 'time_stamp' values is missing. I do not want to create a new column, instead fill missing values in 'time_stamp'
What I'm trying to do is replace this pandas code to pyspark code:
data['time_stamp'] = data.apply(lambda x: x['hour'] if pd.isna(x['time_stamp']) else x['time_stamp'], axis=1)