
I am trying to save a Structured Streaming Dataset into a given Cassandra table.

I am using the DataStax Cassandra connector, spark-cassandra-connector_2.11.

When I try to save the Dataset as shown below,

dataSet
    .writeStream()
    .format("org.apache.spark.sql.cassandra")
    .option("table",table)
    .option("keyspace", keyspace)
    .outputMode("append")
    .start();

it throws the error:

Data source org.apache.spark.sql.cassandra does not support streamed writing

How should this be handled?

  • Maybe this can help: https://stackoverflow.com/questions/50037285/writing-spark-structure-streaming-data-into-cassandra – Shaido May 30 '19 at 08:48
  • @Shaido, thank you, but I am using open-source Cassandra 3.x, not DSE. Any suggestions on how other people are doing this? – BdEngineer May 30 '19 at 13:05
  • 1
    I'm not too familiar with it myself to be honest. Did you see the second answer in the link above? It looks like it should work for Cassandra (not DSE). – Shaido May 30 '19 at 14:21
  • Possible duplicate of [Writing Spark Structure Streaming data into Cassandra](https://stackoverflow.com/questions/50037285/writing-spark-structure-streaming-data-into-cassandra) – Alex Ott Jun 14 '19 at 06:55

1 Answer


There are several options here:

  1. With Spark Cassandra Connector (SCC) version 2.x, Spark < 2.4, and OSS Cassandra, the only choice is to implement a custom foreach operation, as is done here;
  2. With SCC version 2.x, Spark >= 2.4, and OSS Cassandra, we can use foreachBatch with a normal batch write operation, like here;
  3. For DSE, we can just use data.writeStream().format("org.apache.spark.sql.cassandra"), as DSE Analytics ships a custom SCC;
  4. Starting with SCC 2.5, the DSE-specific functionality is available for OSS Cassandra as well, so we can use it the same way as with DSE, as shown in the docs.
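
For the Spark >= 2.4 case (option 2), the foreachBatch approach can be sketched as below. This is only a sketch under the question's setup: dataSet, table, and keyspace are assumed to be defined as in the question, a reachable Cassandra cluster is assumed, and the checkpoint path is a placeholder:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.streaming.StreamingQuery;

// Sketch: write each micro-batch with the regular (non-streaming) Cassandra
// sink, which SCC 2.x does support, instead of a streaming sink.
StreamingQuery query = dataSet
    .writeStream()
    .foreachBatch((Dataset<Row> batch, Long batchId) ->
        batch.write()                                   // plain batch write per micro-batch
             .format("org.apache.spark.sql.cassandra")
             .option("table", table)                    // placeholders from the question
             .option("keyspace", keyspace)
             .mode(SaveMode.Append)
             .save())
    .option("checkpointLocation", "/tmp/checkpoint")    // placeholder path; required for streaming queries
    .outputMode("append")
    .start();

query.awaitTermination();
```

For option 4, the writeStream().format("org.apache.spark.sql.cassandra") code from the question should work as-is once SCC 2.5+ is on the classpath, provided a checkpointLocation option is also set.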
Alex Ott