7

According to this page: https://spark.apache.org/sql/ you can connect existing BI tools to Spark SQL via ODBC or JDBC.

[Screenshot: Spark SQL]

I don't mean Shark as this is basically EOL:

It is for this reason that we are ending development in Shark as a separate project and moving all our development resources to Spark SQL, a new component in Spark.

How would a BI tool (like Tableau) connect to Spark SQL via ODBC?

Chris Matta

6 Answers

4

With the release of Spark SQL 1.1 you also have a Thrift JDBC/ODBC server; see https://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine
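
As a rough illustration of what a client would point at: the Thrift server described in the linked guide is compatible with HiveServer2, so JDBC clients typically use a `jdbc:hive2://` URL. The sketch below only builds such a URL. The host, port 10000 and the `default` database are assumed placeholders, and `thrift_jdbc_url` is a hypothetical helper, not part of Spark.

```python
# Sketch only: builds the JDBC URL a client (or a BI tool's JDBC settings)
# would use for the Spark SQL Thrift server.
# Assumptions: host, port and database below are placeholders for your setup.


def thrift_jdbc_url(host: str = "localhost", port: int = 10000,
                    database: str = "default") -> str:
    """Return a HiveServer2-style JDBC URL (hypothetical helper)."""
    if not host:
        raise ValueError("host must be non-empty")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return f"jdbc:hive2://{host}:{port}/{database}"


if __name__ == "__main__":
    # Print the URL for a locally running Thrift server.
    print(thrift_jdbc_url())
```

The resulting string is what you would paste into a JDBC client's connection field; whether a given BI tool accepts it depends on the driver it ships with.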

nealmcb
Arnon Rotem-Gal-Oz
3

Simba provides the ODBC driver that Databricks uses; however, that is only for the Databricks distribution. We are launching the public version for use with Apache Spark tomorrow (Wed, Dec 3rd) at www.simba.com. You'll be able to download and trial the driver for use with Tableau then.

KylePorter
1

Please take a look at: http://www.openstratio.org/blog/connecting-to-the-stratio-big-data-platform-using-odbc-2/

Stratio is a platform that includes a certified Spark distribution and lets you connect Spark to any type of data repository (such as Cassandra or MongoDB). It has an ODBC driver, so you can write SQL queries that are translated into Spark jobs or, faster still where possible, into direct queries against Cassandra or whichever database you connect to it. This makes it fairly simple to connect Tableau to Spark and your data repository. If you need any help, we will be more than glad to assist you.

Disclaimer: I'm one of Stratio's ODBC developers

Carlos
  • This doesn't answer the question. My question was whether there is an ODBC driver for Spark, and your answer is "Use this whole different platform". – Chris Matta Oct 03 '14 at 19:56
  • Not that I know of, I'm sorry. You could also try an ODBC driver for Shark (but you would need to use Shark). – Carlos Oct 06 '14 at 14:38
1

As Carlos said, Stratio Meta is a module that acts as a parser, validator, planner and coordinator layer over the different persistence layers (currently only Cassandra and Mongo, with HDFS planned in the short term). This module offers a shell with a SQL-like language, a Java/Scala API, a REST API and ODBC (JDBC shortly). It also uses another Stratio module, Stratio Deep, which lets us use Apache Spark to execute queries quickly and efficiently.

Disclaimer: I am currently employed by Stratio Big Data

miguel0afd
1

Simba will offer one: http://databricks.com/blog/2014/04/30/Databricks-selects-Simba-ODBC-driver-for-shark.html. No known official release date.

[update]

Use Hive's ODBC driver to connect to Spark SQL, as described here and here.
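
A minimal sketch of what that might look like from a script, assuming a Hive ODBC driver is installed and a DSN pointing at the Spark SQL Thrift server has already been configured. The DSN name `SparkSQL`, the helper names and the use of the third-party `pyodbc` package are all assumptions for illustration; a BI tool such as Tableau would use the same DSN through its own ODBC dialog.

```python
# Sketch only: query Spark SQL through an ODBC DSN backed by a Hive ODBC driver.
# Assumptions: the DSN name is hypothetical; pyodbc is a third-party package
# that must be installed separately.


def odbc_connection_string(dsn, user=None, password=None):
    """Build a DSN-based ODBC connection string (hypothetical helper)."""
    parts = [f"DSN={dsn}"]
    if user is not None:
        parts.append(f"UID={user}")
    if password is not None:
        parts.append(f"PWD={password}")
    return ";".join(parts)


def run_query(dsn, sql, user=None, password=None):
    """Run one statement over ODBC and return all rows."""
    import pyodbc  # imported lazily so the string helper works without it

    # autocommit=True is an assumption: Hive-style drivers generally
    # do not support transactions.
    conn = pyodbc.connect(odbc_connection_string(dsn, user, password),
                          autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Requires a running Thrift server and a configured DSN.
    for row in run_query("SparkSQL", "SHOW TABLES"):
        print(row)
```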

Martin Tapp
0

For Spark on Azure HDInsight, you can connect Tableau (or PowerBI) as described here https://azure.microsoft.com/en-us/documentation/articles/hdinsight-apache-spark-use-bi-tools/. The ODBC driver is here: http://www.microsoft.com/en-us/download/details.aspx?id=47713

benjguin
  • The steps in the link are creating a Hive table and then using Power BI on top of Hive. As far as I can tell it is not connecting to Spark at all. – Kit Menke Feb 17 '16 at 14:26