Spark SQL - load data with JDBC using SQL statement, not table name

Question

I think I am missing something but can't figure what. I want to load data using SQLContext and JDBC using particular sql statement like

select top 1000 text from table1 with (nolock)
where threadid in (
  select distinct id from table2 with (nolock)
  where flag=2 and date >= '1/1/2015' and  userid in (1, 2, 3)
)

Which method of SQLContext should I use? Examples I saw always specify table name and lower and upper margin.

Thanks in advance.

zero323 · Accepted Answer · 2017-02-06T12:10:35.027

7

You should pass a valid subquery as a dbtable argument. For example in Scala:

val query = """(SELECT TOP 1000 
  -- and the rest of your query
  -- ...
) AS tmp  -- alias is mandatory*"""   

val url: String = ??? 

val jdbcDF = sqlContext.read.format("jdbc")
  .options(Map("url" -> url, "dbtable" -> query))
  .load()

* Hive Language Manual SubQueries: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries

edited Feb 06 '17 at 12:10

answered Dec 19 '15 at 04:42

zero323

322,348
103
959
935

1

Thank you. It worked. I just used a read.jdbc instead of read.format sqlContext.read.jdbc(url, table, prop) – Igor K. Dec 19 '15 at 16:54

Charlie 木匠 · Answer 2 · 2018-02-01T22:45:28.100

-3

val url = "jdbc:postgresql://localhost/scala_db?user=scala_user"
Class.forName(driver)

val connection = DriverManager.getConnection(url)

val df2 = spark.read
      .format("jdbc")
      .option("url", url)
      .option("dbtable", "(select id,last_name from emps) e")
      .option("user", "scala_user")
      .load()

The key is "(select id,last_name from emps) e", here you can write a subquery in place of table_name.

edited Feb 01 '18 at 22:45

answered Feb 01 '18 at 20:41

Charlie 木匠

2,234
19
19

1

Can you add more details and an explanation? Please read [How to write a good answer](https://stackoverflow.com/help/how-to-answer) – Spoody Feb 01 '18 at 21:01
Sure. There will be a blog on this. – Charlie 木匠 Feb 01 '18 at 22:44

Spark SQL - load data with JDBC using SQL statement, not table name

2 Answers2

Linked

Related