
How can we process data from JMS sources (such as Solace) in Spark Structured Streaming? I could find implementations using DStreams, but nothing using the newer Structured Streaming API. Any pointers or tips on Spark Structured Streaming with JMS would be great.

spats
  • Hi, did you find any information on it? – Alex Sergeenko Nov 11 '19 at 13:07
  • I'm not familiar with Spark and this question is definitely old, but for others that come across this question Solace has this integration guide that shows Solace PubSub+ as a JMS provider for an Apache Spark Streaming custom receiver in case it's useful. https://docs.solace.com/Developer-Tools/Integration-Guides/Spark-Streaming.htm – Mrc0113 Apr 29 '20 at 15:22
  • @AlexSergeenko There is no Structured Streaming support for JMS, and we didn't find any good libraries for it. We did implement our own Spark receiver for Solace inspired by https://github.com/srnghn/spark-mq-receiver, but it was way too complex and had a lot of drawbacks, so we just built a simple Java receiver distributed/parallelized through YARN. – spats Jul 31 '20 at 12:20
  • @Mrc0113 There are a lot of issues with the Solace Spark JMS guide linked above. 1. It's based on the very old Spark 1.3, so no Structured Streaming, and its streaming logic follows an outdated approach that is not recommended in recent versions. 2. It doesn't fully guarantee no data loss, and checkpointing will have issues when the data schema evolves. – spats Jul 31 '20 at 12:28
  • Thanks for the info @spats. Any chance you open-sourced the Spark receiver for Solace that you ended up implementing? – Mrc0113 Aug 04 '20 at 16:17
  • We ended up using a plain Java consumer/processor for Solace instead of using Solace as a native Spark data source. Spark + YARN is just used to run this code in a container (in each executor). – spats Feb 26 '22 at 01:49
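The approach the last comment describes, a plain Java consumer running inside each executor rather than a native Spark source, can be sketched roughly as below. This is a minimal illustration, not spats' actual code: the `MessageSource` interface here is a hypothetical stand-in for a JMS `MessageConsumer` (in production you would wire in the real Solace JMS API), and the process-then-acknowledge ordering gives at-least-once delivery, matching the no-data-loss concern raised about the DStream receiver guide.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class JmsLikeWorker {

    // Hypothetical stand-in for a JMS MessageConsumer: receive() returns the
    // next payload or null when nothing is available; ack() acknowledges the
    // last received message. A real implementation would wrap Solace JMS.
    public interface MessageSource {
        String receive(long timeoutMs);
        void ack();
    }

    // Poll until the source returns null, processing each message and only
    // then acknowledging it, so a handler failure leads to redelivery
    // (at-least-once semantics). Returns the number of messages processed.
    public static int drain(MessageSource src, Consumer<String> handler, long timeoutMs) {
        int n = 0;
        String msg;
        while ((msg = src.receive(timeoutMs)) != null) {
            handler.accept(msg); // process first...
            src.ack();           // ...then ack, so failures cause redelivery
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // In-memory fake source so the sketch runs without a broker.
        BlockingQueue<String> q =
                new LinkedBlockingQueue<>(java.util.List.of("a", "b", "c"));
        MessageSource fake = new MessageSource() {
            public String receive(long timeoutMs) { return q.poll(); }
            public void ack() { /* no-op for the fake */ }
        };
        int n = drain(fake, m -> System.out.println("got " + m), 100);
        System.out.println("processed " + n);
    }
}
```

In the pattern the comments describe, each YARN container (executor) would run one such worker loop against its own JMS connection; Spark itself contributes only the scheduling and containerization, not the streaming semantics.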

0 Answers