I already have a SparkContext created and a global spark variable. I can read ORC files as simply as spark.read.format("orc").load("filepath").
However, for Avro I can't seem to do the same, even though I try to import the jar like so:
spark.conf.set("spark.jars.packages",
"file:///projects/apps/lib/spark-avro_2.11-3.2.0.jar")
I then try to read the Avro file and get an error like so:
Py4JJavaError: An error occurred while calling o65.load.
: org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Please find an Avro package at http://spark.apache.org/third-party-projects.html;
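
For completeness, here is a minimal sketch of the whole sequence that leads to the error. It assumes the existing spark session is passed in. The function name read_avro and the path argument are placeholders I made up for illustration; the config key, jar path, and format name are the ones from above.

```python
def read_avro(spark, path):
    """Reproduce the failing sequence: set the jar config, then load Avro.

    `spark` is assumed to be the already-created session object;
    `path` is a hypothetical placeholder for the Avro file location.
    """
    # Step 1: point the running session at the spark-avro jar
    # (same key and value as in the snippet above).
    spark.conf.set(
        "spark.jars.packages",
        "file:///projects/apps/lib/spark-avro_2.11-3.2.0.jar",
    )
    # Step 2: read with the same pattern that works for ORC.
    # On my setup this is the call that raises the AnalysisException.
    return spark.read.format("avro").load(path)
```

Calling read_avro(spark, "filepath") is what produces the Py4JJavaError shown above.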