
I get the following error:

18/03/14 15:31:11 ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.AnalysisException: Table or view not found: products; line 1 pos 42

This is my code:

val spark = SparkSession
  .builder()
  .appName("Test")
  .getOrCreate()

val products = spark.read.parquet(productsPath)
products.createGlobalTempView("products")

val q1 = spark.sql("SELECT PERCENTILE(product_price, 0.25) FROM products").map(_.getAs[Double](0)).collect.apply(0)

What am I doing wrong? Is it possible to do the same thing in Spark without using SQL?

ZygD
Markus

3 Answers


TEMPORARY VIEW

Just use createOrReplaceTempView instead, and query the view by its plain name:

products.createOrReplaceTempView("products")

val q1 = spark.sql("SELECT PERCENTILE(product_price, 0.25) FROM products").map(_.getAs[Double](0)).collect.apply(0)

GLOBAL TEMPORARY VIEW

If you use a global temp view, you must qualify the view name with the global_temp database:

products.createGlobalTempView("products")

val q1 = spark.sql("SELECT PERCENTILE(product_price, 0.25) FROM global_temp.products").map(_.getAs[Double](0)).collect.apply(0)
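For reference, here is a minimal self-contained sketch of both variants that can be pasted into spark-shell. The local master, the app name, and the five made-up prices are assumptions for illustration (they stand in for the Parquet data), and createOrReplaceGlobalTempView needs Spark 2.2 or later. It reads the single result with first().getDouble(0), which avoids the need for an encoder in the map call.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; on a cluster the master comes from spark-submit.
val spark = SparkSession.builder().appName("ViewDemo").master("local[*]").getOrCreate()
import spark.implicits._

// Made-up prices standing in for the Parquet data (assumption).
val products = Seq(1.0, 2.0, 3.0, 4.0, 5.0).toDF("product_price")

// Session-scoped view: queried by its bare name.
products.createOrReplaceTempView("products")
val q1Local = spark
  .sql("SELECT PERCENTILE(product_price, 0.25) FROM products")
  .first()
  .getDouble(0)

// Global view: registered in the global_temp database, so the name must be qualified.
products.createOrReplaceGlobalTempView("products")
val q1Global = spark
  .sql("SELECT PERCENTILE(product_price, 0.25) FROM global_temp.products")
  .first()
  .getDouble(0)
```

Both queries read the same data, so q1Local and q1Global should be equal.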
Ramesh Maharjan

All global temporary views are created in global_temp, a database that Spark reserves for them, so the view name has to be qualified with it.

The following should work:

val q1 = spark.sql("""SELECT PERCENTILE(product_price, 0.25) 
    FROM global_temp.products""").map(_.getAs[Double](0)).collect.apply(0)

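Spark has two types of temporary views: a temp view is visible only in the session that created it, while a global temp view is shared by all sessions of the same application. See the post here for more details.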
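The scoping difference can be checked with a second session. This is a small sketch under assumptions: the view names products_local and products_shared are hypothetical, the data is made up, and a local master is used so it runs in spark-shell. It relies on spark.table failing at analysis time when a name cannot be resolved.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

val spark = SparkSession.builder().appName("ViewScopeDemo").master("local[*]").getOrCreate()
import spark.implicits._

// Made-up data; the view names below are hypothetical.
val products = Seq(1.0, 2.0, 3.0).toDF("product_price")
products.createOrReplaceTempView("products_local")
products.createOrReplaceGlobalTempView("products_shared") // Spark 2.2+

// A second session inside the same application.
val other = spark.newSession()

// The session-scoped view is not visible from the other session.
val localVisible = Try(other.table("products_local")).isSuccess

// The global view is visible, as long as it is qualified with global_temp.
val sharedVisible = Try(other.table("global_temp.products_shared")).isSuccess

// Without the qualifier the lookup fails, which is the error from the question.
val unqualifiedVisible = Try(spark.table("products_shared")).isSuccess
```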

philantrovert
Rahul Sharma

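If you want to use the DataFrame API (org.apache.spark.sql) instead of a SQL string, you can try ntile, which assigns each row to a percentile bucket within its window rather than returning a single percentile value:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.ntile
import spark.implicits._ // enables the $"..." column syntax

// Field1, Field2 and Field are placeholders for your own columns
val wdw = Window.partitionBy($"Field1", $"Field2").orderBy($"Field".asc)

// ntile(100) splits each partition into 100 ordered buckets
products.withColumn("percentile", ntile(100).over(wdw))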
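If a single value for the 25th percentile is what you need, two other options avoid registering a view altogether. The sketch below is only an illustration under assumptions: the data is made up and a local master is used. The first option still evaluates the percentile expression through expr, just without a view; the second uses approxQuantile, which returns one of the values present in the column instead of interpolating, so the two results may differ slightly on real data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

val spark = SparkSession.builder().appName("PercentileDemo").master("local[*]").getOrCreate()
import spark.implicits._

// Made-up prices standing in for the Parquet data (assumption).
val products = Seq(1.0, 2.0, 3.0, 4.0, 5.0).toDF("product_price")

// Same aggregate as the SQL query, applied directly to the DataFrame.
val q1Exact = products
  .agg(expr("percentile(product_price, 0.25)"))
  .first()
  .getDouble(0)

// approxQuantile(column, probabilities, relativeError); a relativeError of 0.0
// requests the exact quantile, at a higher cost on large data.
val q1Approx = products.stat
  .approxQuantile("product_price", Array(0.25), 0.0)
  .head
```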
Angel F O