I am using PySpark in Databricks to clean data, with the cleaning expressions stored in a JSON file. The expression passed to eval() below comes from that JSON file.
One of the issues I am facing is manipulating timestamps/strings.
I am trying to find the difference in months between a timestamp column and a single date (which is a string). See the code below.
import pyspark.sql.functions as F
df2 = df2.withColumn('test', eval("F.months_between(F.to_date(F.col('period_name')), F.lit('31/03/2019'))"))
It doesn't throw an error, but the new column evaluates to null.
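A likely cause, offered as an assumption rather than a confirmed diagnosis: the string '31/03/2019' is not in a format Spark can implicitly cast to a date (it expects something like yyyy-MM-dd), so the second argument becomes null and months_between returns null. The minimal sketch below parses the literal with an explicit 'dd/MM/yyyy' pattern via F.to_date. The names FIXED_EXPR and add_month_diff are hypothetical, introduced only for illustration, and the sketch assumes period_name is a timestamp (or an ISO-formatted string) that F.to_date can handle without a pattern.

```python
# Hypothetical corrected expression string, as it might be stored in the JSON file.
# The only change from the question is that the string literal is parsed
# with an explicit date pattern instead of relying on an implicit cast.
FIXED_EXPR = (
    "F.months_between("
    "F.to_date(F.col('period_name')), "
    "F.to_date(F.lit('31/03/2019'), 'dd/MM/yyyy'))"
)


def add_month_diff(df, expr=FIXED_EXPR, out_col="test"):
    """Evaluate the expression string and attach the result as a new column.

    Assumes `df` is a Spark DataFrame with a `period_name` column.
    """
    # Imported here so that F is available to eval(); requires pyspark.
    import pyspark.sql.functions as F

    # Expose only F to the evaluated expression.
    return df.withColumn(out_col, eval(expr, {"F": F}))
```

With this in place, the call would be df2 = add_month_diff(df2). If the same date format recurs across many expressions in the JSON file, keeping the pattern inside each expression string (as here) keeps the eval call itself unchanged.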