We are trying to create a dataset by reading a folder that contains Excel, txt and CSV files, using the following:
sparkSession.read().option("header", "true")
.option("delimiter", delimiter).option("ignoreTrailingWhiteSpace", true)
.option("ignoreLeadingWhiteSpace", true)
.csv(directoryPath + "\\" + feedFolder + "\\*");
The CSV API reads the Excel files as well, which produces garbage data like the output below. How can we stop the CSV API of Apache Spark from reading the .xlsx files? Please let us know.
|?�9L�ҙ�sbgٮ |�l!��USh9i�b�r:"y_dl��D��� |-N��R"4�2�G�%��Z�4�˝y�7\të��ɂ���
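
One direction that might be worth trying, as a rough sketch rather than a confirmed fix: instead of the catch-all `*` pattern, build one glob per wanted extension so that .xlsx files never match. The class `FeedGlobs` and the method `textGlobs` are hypothetical names made up for this illustration, and forward slashes are assumed as the path separator. The Spark call is left in a comment because it needs Spark on the classpath; `DataFrameReader.csv` does accept several paths.

```java
import java.util.Arrays;
import java.util.List;

class FeedGlobs {
    // Hypothetical helper: returns one glob per extension, e.g. /data/feeds/*.csv
    // Assumes forward slashes are acceptable as the path separator.
    public static String[] textGlobs(String directoryPath, String feedFolder, List<String> extensions) {
        String base = directoryPath + "/" + feedFolder + "/";
        return extensions.stream()
                .map(ext -> base + "*." + ext)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Only csv and txt are listed, so *.xlsx is never matched.
        String[] paths = textGlobs("/data", "feeds", Arrays.asList("csv", "txt"));
        System.out.println(String.join("\n", paths));

        // With Spark available, the array could be handed to the reader (untested here):
        // Dataset<Row> ds = sparkSession.read()
        //         .option("header", "true")
        //         .option("delimiter", delimiter)
        //         .csv(paths);
    }
}
```

Listing the extensions explicitly keeps the filter in one place; if the folder later gains other text formats, they would need to be added to the list.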