I have a Spark Structured Streaming application connected to ActiveMQ. The application receives messages from a topic. Each message is an XML document delivered as a string. I want to extract information from this nested XML. How can I do this?
I referred to this post, but was not able to implement something similar in Scala.
XML Format:
<CofiResults>
<ExecutionTime>20201103153839</ExecutionTime>
<FilterClass>S </FilterClass>
<InputData format="something" id="someID"><ns2:FrdReq xmlns:ns2="http://someone.com">
<HeaderSegment xmlns="https://somelink.com">
<Version>6</Version>
<SequenceNb>1</SequenceNb>
</HeaderSegment>
.
.
.
My Code:
import org.apache.spark.sql.functions.col

// Read the message stream from the broker through the Bahir MQTT source
val df = spark.readStream
  .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
  .option("brokerUrl", brokerUrl_)
  .option("topic", topicName_)
  .option("persistence", "memory")
  .option("cleanSession", "true")
  .option("username", username_)
  .option("password", password_)
  .load()

// The payload column, cast to a string, IS the XML string
val payload_ = df.select(col("payload").cast("string"))
Now I need to extract ExecutionTime, Version, and other fields from the above XML.
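To make the goal concrete, here is a minimal sketch of the kind of extraction I am after, written with the JDK's built-in XPath API rather than any Spark-specific library. The names `XmlFields`, `extractField`, and `sample` are hypothetical, and `sample` is a shortened, closed version of the XML above with the elided fields simply omitted. Matching on `local-name()` is an assumption made so that the namespaced elements (`ns2:FrdReq`, `HeaderSegment`) do not need a namespace context.

```scala
import java.io.StringReader
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource

object XmlFields {
  // Hypothetical, shortened copy of the message: elided fields are left out
  // and the open elements are closed so that the document is well-formed.
  val sample: String =
    """<CofiResults>
      |<ExecutionTime>20201103153839</ExecutionTime>
      |<FilterClass>S </FilterClass>
      |<InputData format="something" id="someID"><ns2:FrdReq xmlns:ns2="http://someone.com">
      |<HeaderSegment xmlns="https://somelink.com">
      |<Version>6</Version>
      |<SequenceNb>1</SequenceNb>
      |</HeaderSegment>
      |</ns2:FrdReq></InputData>
      |</CofiResults>""".stripMargin

  // Returns the trimmed text of the first element (in document order) whose
  // local name equals `localName`, ignoring namespaces.
  // Returns an empty string when no such element exists.
  def extractField(xml: String, localName: String): String = {
    val xpath = XPathFactory.newInstance().newXPath()
    val query = s"//*[local-name()='$localName']"
    xpath.evaluate(query, new InputSource(new StringReader(xml))).trim
  }
}
```

With this, `XmlFields.extractField(XmlFields.sample, "ExecutionTime")` yields `20201103153839` and `XmlFields.extractField(XmlFields.sample, "Version")` yields `6`. On the stream, I assume such a function could be wrapped with `org.apache.spark.sql.functions.udf` and applied to the `payload` column, or replaced by Spark SQL's built-in `xpath_string` function with the same `local-name()` expression, but I have not got either variant working yet.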