Does Spark Supports With Clause like SQL?

Question

I have table employee_1 in spark with attributes id and name(with data), and another table employee_2 with same attributes, i want to load the data by increasing the id values with +1

My With Clause shown below:

WITH EXP AS (SELECT  ALIASNAME.ID+1 ID, ALIASNAME.NAME NAME FROM employee_1 ALIASNAME)
INSERT INTO TABLE employee_2 SELECT * FROM EXP;

Steps of execution:

I have a file(with data) in HDFS location.

Creating RDD based on hdfs location.
RDD to Hive temp table
from temp table to Hive Target (employee_2).

when i am running with test program from backend its succeeding. but data is not loading. employee_2 is empty.

Note: If you run the above with clause in Hive it will succeed and data will load. But in spark it won't in 1.6 ?

On Spark 2.x it definitely supports WITH statements. I can try to dust off my old 1.6 to verify for you but fundamentally it looks like it should run — Garren S, May 17 '17 at 03:47
Thanks, Used 1.6, 2.x versions working.. in sparksql context — Ganesh Kumar, Dec 22 '17 at 07:13

score 6 · Accepted Answer · edited Oct 03 '22 at 08:13

The WITH statement is not the problem, but rather the INSERT INTO statement that's causing trouble.

Here's a working example that uses the .insertInto() style instead of the "INSERT INTO" SQL:

val s = Seq((1,"foo"), (2, "bar"))
s: Seq[(Int, String)] = List((1,foo), (2,bar))
val df = s.toDF("id", "name")
df.registerTempTable("df")
sql("CREATE TABLE edf_final (id int, name string)")
val e = sql("WITH edf AS (SELECT id+1, name FROM df cook) SELECT * FROM edf")
e.insertInto("edf_final")

Another option is to use the df.write.mode("append").saveAsTable("edf_final") style.

Relevant SO: "INSERT INTO ..." with SparkSQL HiveContext

score 0 · Answer 2 · answered Apr 09 '23 at 05:22

Since answer is for Spark 2.x I am rewriting in spark 3 way

%scala

import org.apache.spark.sql.functions.col

val s = Seq((1,"foo"), (2, "bar"))
val df = s.toDF("id", "name")
df.createOrReplaceTempView("df")
spark.sql("CREATE TABLE if not exists edf_final (id int, name string)")
val e = spark.sql("WITH edf AS (SELECT id+1 AS id, name FROM df) SELECT * FROM edf")
e.select(col("id"), col("name")).write.insertInto("edf_final")

spark.sql("select * from edf_final").show

Result :

df:org.apache.spark.sql.DataFrame
id:integer
name:string
e:org.apache.spark.sql.DataFrame
id:integer
name:string
+---+----+
| id|name|
+---+----+
|  2| foo|
|  3| bar|
+---+----+

Does Spark Supports With Clause like SQL?

My With Clause shown below:

Steps of execution:

2 Answers2