I have a function that casts the columns of a DataFrame to a specified schema in PySpark. The cast function silently turns an entry into null if it cannot convert the value to the target datatype.
e.g. F.col(col_name).cast(IntegerType())
will cast the column to Integer, and if a value cannot be represented as an Integer (for example, a Long value that overflows the Integer range), it is silently replaced with null.
Is there any way to capture the cases where the cast produces null? In a data pipeline that runs daily, if these cases are not captured, the values are silently nulled and passed on to downstream systems.