I have a file that I can correctly read this way:
sqlContext.read.format('csv').options(header='false', inferSchema='true', delimiter='\a', nullValue='\\N').load('adl://resource.azuredatalakestore.net/datalake-prod/raw/something/data/something/date_part={}/{}'.format(elem[0], elem[1]))
The problem is that the file has no header; the column names live in a separate .avsc file, an Apache Avro schema object.
What's the best way to use it as the header of my DataFrame?
I'm running PySpark on Azure Databricks.
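For context, one possible approach (a sketch, not a confirmed answer): an .avsc file is plain JSON, so its `fields` array can be parsed with the standard `json` module and the extracted names applied to the headerless DataFrame via `toDF`. The path `/dbfs/path/to/schema.avsc` and the helper name `avsc_column_names` are hypothetical placeholders.

```python
import json

def avsc_column_names(avsc_path):
    """Extract the ordered field names from an Avro schema (.avsc) file.

    An .avsc file is a JSON document; for a record schema the column
    names are the "name" entries of its "fields" array.
    """
    with open(avsc_path) as f:
        schema = json.load(f)
    return [field["name"] for field in schema["fields"]]

# Applying the names to the headerless DataFrame loaded as in the
# question (the column count must match the schema's field count;
# the path below is a hypothetical example):
#
# names = avsc_column_names("/dbfs/path/to/schema.avsc")
# df = df.toDF(*names)
```

This only renames columns; it does not enforce the Avro types, which `inferSchema='true'` is already guessing from the data.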