Does Spark support Structured Streaming with a Kinesis stream as the data source? The Databricks distribution appears to support it: https://docs.databricks.com/structured-streaming/kinesis-best-practices.html. However, does Spark outside of Databricks support this feature?

Zach King

1 Answer

Yes, you can use the following open source connector: https://github.com/roncemer/spark-sql-kinesis

Example:

// Stream data from the "spark-streaming-example" stream.
// Note: when running on AWS EC2, you can omit the access and secret keys and rely on the IAM role attached to the instance instead.

val kinesis = spark
    .readStream
    .format("kinesis")
    .option("streamName", "spark-streaming-example")
    .option("endpointUrl", "https://kinesis.us-east-1.amazonaws.com")
    .option("awsAccessKeyId", "[ACCESS_KEY]")   // placeholder: your AWS access key ID
    .option("awsSecretKey", "[SECRET_KEY]")     // placeholder: your AWS secret key
    .option("startingposition", "TRIM_HORIZON") // start from the oldest available record
    .load()
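
To actually see records, the streaming DataFrame above still needs a sink. The following is only a rough sketch, not taken from the connector's documentation: it assumes the source exposes the record payload in a binary column named `data` (check with `kinesis.printSchema()` first), and the helper name `printPayloads` and the checkpoint path are made up for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper: decode each record's payload and print it to the console.
// Assumption: the Kinesis source provides a binary column called "data".
def printPayloads(kinesis: DataFrame): StreamingQuery =
  kinesis
    .selectExpr("CAST(data AS STRING) AS payload") // bytes -> string (UTF-8)
    .writeStream
    .format("console")                             // built-in debugging sink
    .outputMode("append")
    .option("truncate", "false")                   // show full payloads
    .option("checkpointLocation", "/tmp/kinesis-example-checkpoint") // illustrative path
    .start()

// Usage, given the `kinesis` DataFrame defined above:
// val query = printPayloads(kinesis)
// query.awaitTermination()
```

The console sink is meant for debugging only; for real workloads you would swap in a durable sink and a checkpoint location on reliable storage.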
Zach King