I have XML files stored in an S3 bucket and want to read them on EMR. After running:
sqlContext.read.format("com.databricks.spark.xml").option("rowTag", "Profile").load(xml_file_path)
I got the following error:
An error occurred while calling o445.load. : java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark.apache.org/third-party-projects.html