For example, I have a DataFrame that needs some processing and type conversions on its columns, and I keep overwriting the same DataFrame again and again, as in the code below:
import org.apache.spark.sql.functions.col

var fd = spark.read.format("csv")
  .option("inferSchema", "false")
  .option("header", "true")
  .load(csvFile)

fd = fd.withColumn("date", col("date").cast("string"))
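With more columns the reassignment repeats for each one. As a sketch of what I mean (the extra column names `amount` and `quantity` here are made up, just to illustrate the pattern):

```scala
// Hypothetical further conversions, each one overwriting fd again:
fd = fd.withColumn("amount", col("amount").cast("double"))
fd = fd.withColumn("quantity", col("quantity").cast("int"))
```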
I am new to Spark, so I don't know a better approach for this kind of operation. Any suggestions?