The following is my source data:
+-----+----------+
|Name |Date      |
+-----+----------+
|Azure|2018-07-26|
|AWS  |2018-07-27|
|GCP  |2018-07-28|
|GCP  |2018-07-28|
+-----+----------+
I partitioned the data by the Date column when writing it out, then read it back:
udl_file_df_read.write.format("csv").partitionBy("Date").mode("append").save(outputPath)
val events = spark.read.format("com.databricks.spark.csv").option("inferSchema","true").load(outputPath)
events.show()
The output column names are (c0, Date). I am not sure why the original column name is missing. How do I retain the column names?
Note: This is not a duplicate question, for the following reasons: here, the columns other than the partition column are renamed to c0, and specifying the base path as an option doesn't work.
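For reference, here is a minimal sketch of one thing that may be worth checking. It assumes the cause is that the CSV files were written without a header row: Spark's CSV writer does not emit one unless `header` is set to `true`, and the reader generates placeholder column names when it is not told to expect one. The partition column would survive either way because its name is stored in the directory names (`Date=2018-07-26/...`). The object name `PartitionedCsvHeaders`, the helper `roundTrip`, and the path value are hypothetical, used only for illustration, and `overwrite` mode is used here just to keep the demo repeatable.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object PartitionedCsvHeaders {

  // Hypothetical helper: writes the sample data partitioned by Date with a
  // header row in every part file, then reads it back expecting that header.
  def roundTrip(spark: SparkSession, outputPath: String): DataFrame = {
    import spark.implicits._

    // Same rows as the source data in the question.
    val df = Seq(
      ("Azure", "2018-07-26"),
      ("AWS", "2018-07-27"),
      ("GCP", "2018-07-28"),
      ("GCP", "2018-07-28")
    ).toDF("Name", "Date")

    // Assumption: the missing piece is the header option on the write side.
    df.write
      .format("csv")
      .option("header", "true")
      .partitionBy("Date")
      .mode("overwrite")
      .save(outputPath)

    // The read side must also be told that the first line is a header.
    spark.read
      .format("csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load(outputPath)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-csv-headers")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical output location.
    val events = roundTrip(spark, "/tmp/partitioned_csv_demo")
    events.printSchema()
    events.show()

    spark.stop()
  }
}
```

If the files under the output path were already written without headers, setting `header` only on the read side would likely treat the first data row of each file as column names, so the data would need to be rewritten or an explicit schema supplied via `.schema(...)` instead.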