I have a directory containing CSV files that all have the same columns, but not in the same order. I would like to append them into a single CSV file, but when I do that with PySpark using the following code, I get a CSV with mixed data inside (i.e. the values are not lined up under the correct columns):
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.functions import col
sc = SparkContext("local", "Simple App")
sqlContext = SQLContext(sc)
df = sqlContext.read.format('com.databricks.spark.csv').options(header='true', inferschema='true').load('/myPATH/TO_THE_CSV_FILES/')
df.coalesce(1).write.option("header", "true").format('com.databricks.spark.csv').save('/myPATH/TO_APPENDED_CSV_FILE/')
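The likely cause is that a single directory-wide read applies one schema (taken from the files it samples) positionally to every file, so rows from files with a different header order end up under the wrong columns. One way around this, sketched below with the standard library rather than Spark so it stays self-contained, is to read each file keyed by its own header and write everything out in one canonical column order. The `append_csvs` helper, the glob pattern, and the paths are illustrative assumptions, not from the original post.

```python
import csv
import glob

def append_csvs(pattern, out_path):
    """Append CSV files that share columns but not column order.

    Each file is read with DictReader, which keys every row by that
    file's own header, so values stay attached to the right column
    names regardless of the order they appear in on disk.
    """
    paths = sorted(glob.glob(pattern))
    if not paths:
        return
    # Take the column order of the first file as the canonical order.
    with open(paths[0], newline="") as f:
        fieldnames = csv.DictReader(f).fieldnames
    with open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        for path in paths:
            with open(path, newline="") as f:
                for row in csv.DictReader(f):
                    # DictWriter reorders each row to match fieldnames.
                    writer.writerow(row)
```

The same idea carries over to PySpark: read each file into its own DataFrame, `select` the columns in one fixed order, and union the results instead of loading the whole directory in a single read.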