Is that possible to perform set of operations on dataframe (adding new columns, replace some existing values, etc) and do not fast fail on first failed rows, but instead perform full transformation and separately return rows that has been processed with errors?
Example: it's more like pseudocode, but the idea must be clear:
df.withColumn('PRICE_AS_NUM', to_num(df["PRICE_AS_STR"]))
to_num - is my custom function of transformation string to number.
assuming I have some records where price can't be cast to number - I want to get those records in separate dataframe.
I see an approach, but it will make code a little ugly (and not quite productive): do a filter with try catch - if exception happen - filter those records into separate df.. What if I have many of such transformations... Is any better way?