I would like to store the results of continuous queries running against streaming data so that they are persisted across distributed nodes, to ensure failover and scalability.
Could the Spark SQL experts please shed some light on the following:
(1) Which storage option should I choose so that OLAP queries are faster?
(2) How can I ensure the data remains available for querying even if one node is down?
(3) How does Spark SQL store the result set internally?
Thanks,
Kaniska