Is it possible to run Shark queries over the data contained in the DStreams of a Spark Streaming application (for instance, inside a foreachRDD call)?
Is there a specific API for that?
Thanks.
To answer my own question, in case someone else runs into the same problem: the direct answer is no, you cannot run Shark directly on Spark Streaming data.
Spark SQL is currently a valid alternative; at least it was for my needs. It is included in Spark and requires no extra configuration. You can have a look at it here: http://spark.apache.org/docs/latest/sql-programming-guide.html
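
For anyone looking for a starting point, here is a minimal sketch of running a Spark SQL query on each batch inside foreachRDD. It is not the exact code I used. It assumes a recent Spark version that provides SparkSession (older releases used SQLContext instead), and the socket source on localhost:9999, the object name StreamingSqlSketch, the helper wordCounts and the view name words are all made up for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSqlSketch {

  // Hypothetical helper: registers one batch as a temporary view
  // and runs a SQL query over it.
  def wordCounts(rdd: RDD[String]): DataFrame = {
    // Reuse (or lazily create) a session built from the RDD's own configuration.
    val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
    import spark.implicits._

    // Convert the batch to a single-column DataFrame and expose it to SQL.
    rdd.toDF("word").createOrReplaceTempView("words")
    spark.sql("SELECT word, COUNT(*) AS total FROM words GROUP BY word")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingSqlSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumed input: lines of text arriving on a local socket.
    val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

    // Each micro-batch arrives as an RDD, which can be queried with SQL.
    words.foreachRDD { rdd =>
      wordCounts(rdd).show()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that the temporary view is replaced on every batch, so each query only sees the data of the current batch interval.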