
My understanding is that Spark Structured Streaming is built on top of Spark SQL, not Spark Streaming. Hence the question: do the properties that apply to Spark Streaming also apply to Spark Structured Streaming, such as:

  • spark.streaming.backpressure.initialRate
  • spark.streaming.backpressure.enabled
  • spark.streaming.receiver.maxRate

thebluephantom
MaatDeamon

2 Answers


No, these settings apply only to the DStream API. Spark Structured Streaming does not have a backpressure mechanism. You can find more details in this discussion: How Spark Structured Streaming handles backpressure?
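For contrast, here is a minimal sketch of where those properties do take effect, assuming the legacy DStream API with a receiver-based source (the app name, socket host/port, batch interval and rate values are illustrative placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // These settings are read by the DStream engine only; a Structured
    // Streaming query ignores them.
    val conf = new SparkConf()
      .setAppName("dstream-backpressure-sketch")                // illustrative app name
      .set("spark.streaming.backpressure.enabled", "true")      // PID-based rate control
      .set("spark.streaming.backpressure.initialRate", "1000")  // initial records/sec per receiver
      .set("spark.streaming.receiver.maxRate", "5000")          // hard cap, records/sec per receiver

    val ssc = new StreamingContext(conf, Seconds(10))           // 10-second batch interval

    // Receiver-based source, so the receiver rate limits above apply.
    val lines = ssc.socketTextStream("localhost", 9999)         // assumes a server on port 9999
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()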

tashoyan

No.

Spark Structured Streaming processes data as soon as possible by default, i.e. after finishing the current batch. You can control the rate of processing per source type, e.g. maxFilesPerTrigger for files and maxOffsetsPerTrigger for Kafka, as sketched below.
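A rough sketch of those two options, assuming a JSON file source and a Kafka source (the path, brokers, topic, schema and numeric limits are all placeholders):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.types.{StringType, StructType}

    val spark = SparkSession.builder().appName("rate-limit-sketch").getOrCreate()

    // File source: pick up at most 10 new files per micro-batch.
    val fileSchema = new StructType().add("value", StringType)  // placeholder schema
    val fileStream = spark.readStream
      .format("json")
      .schema(fileSchema)
      .option("maxFilesPerTrigger", 10)
      .load("/path/to/input")                                   // placeholder path

    // Kafka source: consume at most 10000 offsets per micro-batch,
    // spread across the subscribed topic partitions.
    val kafkaStream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:9092")          // placeholder brokers
      .option("subscribe", "my-topic")                          // placeholder topic
      .option("maxOffsetsPerTrigger", 10000)
      .load()

Both options cap the size of each micro-batch: they throttle how fast Spark pulls data, rather than asking the upstream system to slow down.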

This article, http://javaagile.blogspot.com/2019/03/everything-you-needed-to-know-about.html, explains that backpressure is not relevant.

  • It quotes: "Structured Streaming cannot do real backpressure, because, such as, Spark cannot tell other applications to slow down the speed of pushing data into Kafka."
    • I am not sure this aspect is relevant, as Kafka buffers the data. Nonetheless, the article has good merit imho.
thebluephantom