I have an Excel sheet where one column contains XML documents (each row holds a different XML document). I am trying to parse these with PySpark and spark-xml by calling df = spark.read.format('xml').options(rowTag='book').load(___). The load works fine when you point it at an XML file, but is it possible to read in the Excel sheet and loop over those XML strings so they can be parsed without first writing each one out to its own XML file?
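
To illustrate what I mean by parsing the XML strings in place, here is a minimal sketch in plain Python rather than Spark. It assumes the Excel column has already been read into a list of strings (for example with pandas), and it uses the standard library's ElementTree instead of spark-xml. The sample XML and the names xml_rows and parse_books are hypothetical, made up only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the Excel column: one XML document per row.
xml_rows = [
    "<catalog><book id='1'><title>A</title></book></catalog>",
    "<catalog><book id='2'><title>B</title></book>"
    "<book id='3'><title>C</title></book></catalog>",
]


def parse_books(xml_text, row_tag="book"):
    """Return one dict per <row_tag> element found in a single XML string."""
    root = ET.fromstring(xml_text)  # parses from memory, no file needed
    records = []
    for elem in root.iter(row_tag):
        record = dict(elem.attrib)        # attributes become keys
        for child in elem:
            record[child.tag] = child.text  # direct child elements become keys
        records.append(record)
    return records


# Loop over every row and collect the parsed records into one flat list.
all_books = [book for xml_text in xml_rows for book in parse_books(xml_text)]
```

A list of dicts like all_books could in principle be passed to spark.createDataFrame, but that bypasses spark-xml entirely and only handles flat elements, so it is a fallback rather than the approach I am hoping for.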