
I am reading a folder in ADLS from Azure Databricks; it has subfolders containing Parquet files.

path - base_folder/filename/

The filename folder has subfolders such as 2020 and 2021, and those folders in turn have subfolders for month and day.

So the path to an actual Parquet file looks like base_folder/filename/2020/12/01/part11111.parquet.

I am getting the below error if I give the base folder path.

I have tried the commands in the thread below as well, but it shows the same error: "Unable to infer schema for Parquet. It must be specified manually".


Please help me read all Parquet files in all subfolders into one DataFrame.

Sharyu Aadhatrao

1 Answer


Try with:

spark.read.format("parquet").load(landingFolder)

as specified here: Generic Load/Save Functions

vladsiv
  • Thanks Vlad. It worked in Python using *. Is there a way I can achieve this using Scala? – Sharyu Aadhatrao Nov 08 '21 at 09:58
  • @SharyuAadhatrao You're welcome. It should also work in Scala; have you tried it? Please find examples here: [Read all files in a nested folder in Spark](https://stackoverflow.com/questions/32233575/read-all-files-in-a-nested-folder-in-spark), [Select files using a pattern match](https://kb.databricks.com/scala/pattern-match-files-in-path.html), and also take a look at [Recursive File Lookup](https://spark.apache.org/docs/latest/sql-data-sources-generic-options.html#recursive-file-lookup) (a short Scala sketch based on these links follows after the comments). – vladsiv Nov 08 '21 at 13:02
  • @SharyuAadhatrao If this answers your question, mark it as answered please. – vladsiv Nov 08 '21 at 13:56
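Building on the links in the comments, here is a minimal Scala sketch of the two approaches: a glob over the year/month/day folders and Spark's recursiveFileLookup option (Spark 3.0+). The ADLS path and variable names are placeholders, not the asker's actual values:

// Placeholder ADLS path; replace with the real container and storage account.
val landingFolder = "abfss://container@storageaccount.dfs.core.windows.net/base_folder/filename"

// Option 1: glob the nested year/month/day folders explicitly.
val dfGlob = spark.read.format("parquet").load(s"$landingFolder/*/*/*/")

// Option 2: let Spark walk the whole directory tree (Spark 3.0+).
val dfRecursive = spark.read
  .format("parquet")
  .option("recursiveFileLookup", "true")
  .load(landingFolder)

dfGlob.printSchema()

Either DataFrame should pick up all Parquet files under the nested folders; the glob form also lets you restrict the read to a single year or month by fixing part of the pattern.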