I'm not sure whether this is a bug or just incorrect syntax on my part. I searched around and didn't see it mentioned elsewhere, so I'm asking here before filing a bug report.
I'm trying to use a window function partitioned by fields of a nested (struct) column. I've created a small example below that demonstrates the problem.
import sqlContext.implicits._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.expressions.Window
val data = Seq(("a", "b", "c", 3), ("c", "b", "a", 3)).toDF("A", "B", "C", "num")
  .withColumn("Data", struct("A", "B", "C")).drop("A").drop("B").drop("C")
val winSpec = Window.partitionBy("Data.A", "Data.B").orderBy($"num".desc)
data.select($"*", max("num").over(winSpec) as "max").where("num = max").drop("max").show
The above results in the following error:
org.apache.spark.sql.AnalysisException: resolved attribute(s) A#39,B#40 missing from num#33,Data#37 in operator !Project [num#33,Data#37,A#39,B#40];
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
...
If the partition columns are top-level columns instead of nested fields, it works fine. Am I missing something with the syntax, or is this a bug?
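Since the non-nested case works, one possible workaround is to copy the nested fields into top-level columns and partition on those instead. Below is a minimal sketch of that idea, assuming the same sqlContext as in the example above. The names flat, partA, partB and maxNum are placeholders made up for illustration, and this only sidesteps the error; it does not explain whether the original behaviour is a bug.

```scala
import sqlContext.implicits._
import org.apache.spark.sql.functions._
import org.apache.spark.sql.expressions.Window

// Same data as in the example: A, B, C nested under "Data", plus "num".
val data = Seq(("a", "b", "c", 3), ("c", "b", "a", 3)).toDF("A", "B", "C", "num")
  .withColumn("Data", struct("A", "B", "C")).drop("A").drop("B").drop("C")

// Copy the nested fields into top-level helper columns.
// partA and partB are hypothetical names, not part of the original schema.
val flat = data
  .withColumn("partA", $"Data.A")
  .withColumn("partB", $"Data.B")

// Partition on the top-level helper columns rather than on "Data.A" / "Data.B".
val winSpec = Window.partitionBy("partA", "partB").orderBy($"num".desc)

// Keep the rows whose num equals the window max, then drop the helper columns.
flat.select($"*", max("num").over(winSpec) as "maxNum")
  .where($"num" === $"maxNum")
  .drop("maxNum").drop("partA").drop("partB")
  .show
```

The helper columns are dropped at the end, so the result keeps the original schema (num and Data).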