What is wrong with this usage of `first`? I want to take the first row for each `id` in my DataFrame, but it returns an error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Could not resolve window function 'first_value'. Note that, using window functions currently requires a HiveContext;

The code is:

WindowSpec window = Window.partitionBy(df.col("id"));
df= df.select(first(df.col("*")).over(window));

I am using a HiveContext.

lte__
  • Can you, as a test, try the following code: `WindowSpec window = Window.partitionBy(df.col("id")); df = df.select(first(df.col("id")).over(window));` It's possible that the window function cannot be used with `*`. – T. Gawęda Sep 09 '16 at 10:27

1 Answer


Did you read/create your Spark DataFrame with a plain SQLContext or with a HiveContext? Window functions require a HiveContext.

More detail here: Window function is not working on Pyspark sqlcontext
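As an aside, the result you are after — the first row seen for each `id` — can be sketched in plain Java to make the intended semantics concrete. This is only an illustration of what `first(...).over(Window.partitionBy("id"))` is meant to compute, not Spark code; the class and method names here are made up for the example:

```java
import java.util.*;

public class FirstPerId {
    // Keep the first row encountered for each id (row[0] is the id column),
    // mirroring the semantics of first(...).over(Window.partitionBy("id")).
    static List<String[]> firstPerId(List<String[]> rows) {
        Map<String, String[]> first = new LinkedHashMap<>();
        for (String[] row : rows) {
            first.putIfAbsent(row[0], row); // later rows with the same id are ignored
        }
        return new ArrayList<>(first.values());
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[]{"a", "1"},
            new String[]{"a", "2"},
            new String[]{"b", "3"});
        for (String[] r : firstPerId(rows)) {
            System.out.println(r[0] + "," + r[1]);
        }
    }
}
```

Note that in Spark, unlike this sketch, "first" within a partition is non-deterministic unless the window also specifies an ordering.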

phi