What is wrong with this usage of first()? I want to take the first row for each id in my DataFrame, but it throws an error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Could not resolve window function 'first_value'. Note that, using window functions currently requires a HiveContext;
The code is:
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;
import static org.apache.spark.sql.functions.first;

WindowSpec window = Window.partitionBy(df.col("id"));
df = df.select(first(df.col("*")).over(window));
I am already using a HiveContext.
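
For reference, here is a minimal self-contained sketch of my setup (the class name and input path are placeholders; the failing select is exactly the line above):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;
import org.apache.spark.sql.hive.HiveContext;

import static org.apache.spark.sql.functions.first;

public class FirstRowPerId {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("first-row-per-id");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Window functions require a HiveContext, which is what I create here
        HiveContext sqlContext = new HiveContext(jsc.sc());

        // Placeholder input: any DataFrame with an "id" column
        DataFrame df = sqlContext.read().json("input.json");

        // Partition the rows by id (no ordering is specified)
        WindowSpec window = Window.partitionBy(df.col("id"));

        // This is the line that throws the AnalysisException
        df = df.select(first(df.col("*")).over(window));
        df.show();
    }
}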