I'm following an example of PCA in Spark 3.0.0, using Scala 2.12.10. I'm quite new to Scala and am having trouble understanding some of its nuances.
After defining the data like this:

    import org.apache.spark.ml.linalg.Vectors

    val data = Array(
      Vectors.sparse(5, Seq((1, 1.0), (3, 7.0))),
      Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
      Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0)
    )

the DataFrame is created like this:

    val df = spark.createDataFrame(data.map(Tuple1.apply)).toDF("features")

My question is: what does data.map(Tuple1.apply) do? What confuses me is that apply is not given any arguments.
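
To narrow the question down, here is a minimal Spark-free sketch of my current guess: that Tuple1.apply is being passed to map as a function value, so map itself supplies each element as the argument. The object name Tuple1ApplyDemo and the plain Doubles standing in for the vectors are my own placeholders, not part of the example I'm following.

```scala
// Plain Scala 2.12, no Spark required.
// Tuple1ApplyDemo and the Double data are illustrative placeholders.
object Tuple1ApplyDemo {
  val data = Array(1.0, 2.0, 3.0)

  // apply is not called here; it is handed to map as a function,
  // and map calls it once per element.
  val viaMethodValue = data.map(Tuple1.apply)

  // My guess at the equivalent spelled out with an explicit lambda.
  val viaLambda = data.map(x => Tuple1(x))

  def main(args: Array[String]): Unit =
    // Compares the two arrays element by element.
    println(viaMethodValue.sameElements(viaLambda))
}
```

If that reading is right, both forms should produce an Array of Tuple1 values, each wrapping one element, which would explain why a single column name is passed to toDF.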
Thank you in advance! Could someone also recommend a good beginner Scala / Spark book, so that my future questions can be better ones?