I have a requirement to stream a large dataset held in a Spark RDD to the Driver. We can't call collect() or take() on the RDD from the Driver, because that risks OOMs. Is there any way the data can be streamed through some intermediate channel, i.e. by pushing the RDD data onto a stream that the Driver then reads from?

Nishant

0 Answers