I'm reading files from HDFS using the Spark Structured Streaming API, but the schema is not fixed. I'm using:
sql.streaming.schemaInference: "true"
so the schema might be different for every batch. If I try to select a column using:
dataframe.select(columnName)
it complains if the column doesn't exist. Is there a way to check whether a column exists before selecting it when using the Structured Streaming API?
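
To make the question concrete, here is a minimal sketch of the kind of check meant, assuming PySpark and assuming that `df.columns` (the list of top-level column names) reflects the schema of the DataFrame at hand, for example the batch DataFrame passed to a `foreachBatch` function. The helper name `select_if_present` is hypothetical, not part of Spark.

```python
def select_if_present(df, column_name):
    """Return df.select(column_name) if the column exists, otherwise None.

    Hypothetical helper: `df` is assumed to be a Spark DataFrame
    (batch or streaming), but any object with `.columns` and `.select` works.
    """
    # DataFrame.columns is a plain list of top-level column names, so a
    # membership test avoids the analysis error raised by select().
    if column_name in df.columns:
        return df.select(column_name)
    # Column is absent from this schema; let the caller decide what to do.
    return None
```

Note that this comparison is case-sensitive and only sees top-level columns, whereas Spark resolves column names case-insensitively by default, so a stricter or looser match may be needed depending on the data.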