I have a huge data set partitioned by month. I can write the Parquet files using `df.write.partitionBy("month").parquet(...)`, and reading them back with Spark itself works fine. However, the Parquet files themselves don't contain the partition columns; the partition values are represented only by the folders the files reside in. When reading the Parquet files with an external program (like PolyBase), we cannot tell which month a file belongs to.
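For reference, this is roughly how the files are written (a minimal sketch; `df`, the `month` column, and the paths are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("monthly-export").getOrCreate()

# The large data set with a "month" column (placeholder path and name)
df = spark.read.parquet("/data/source")

# partitionBy moves the "month" column out of the data files and into
# the directory layout, e.g. /data/out/month=2023-01/part-*.parquet
df.write.partitionBy("month").parquet("/data/out")
```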
Is there any way to force Spark to include the partition columns in the Parquet files themselves? If not, are there any alternatives?
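The only workaround I can think of is duplicating the partition column under a different name before writing, so a copy of the value survives inside each file (a sketch, using the same placeholder names as above; `month_value` is a made-up column name):

```python
from pyspark.sql import functions as F

# Keep "month" as the physical partition, and carry a copy of the value
# ("month_value") inside each parquet file as a regular column so that
# external readers can see it.
(df.withColumn("month_value", F.col("month"))
   .write.partitionBy("month")
   .parquet("/data/out"))
```

This works, but it stores each month value twice (once in the directory path and once in the file), which feels wasteful for a huge data set; hence the question about alternatives.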