
I have created a DataFrame from a CSV file. It has 10 columns, two of which are actress and movie title. I want to use the actress as the key and the title as the value, and then reduce by key to get the list of movies for each actress. To do that, I first have to map the actress column to the movie title column. How can I get tuples of (actress, movie title) key-value pairs in Spark with Scala? I want to do this using basic operations, not Spark SQL.

1 Answer


Suggestion: the question is low quality; you should look for examples online first and then try something like:

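import org.apache.spark.sql.functions.collect_list

val df = ???  // the DataFrame loaded from the CSV

val moviesByActressDF = df.groupBy("actress_col")
  .agg(collect_list("movie_col"))  // one row per actress, titles collected into an array column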
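
Since the question asks for basic operations rather than the DataFrame/SQL API, the same result can probably be reached by dropping to the underlying RDD, mapping each row to an (actress, title) tuple, and reducing by key. The following is only a sketch: the column names "actress" and "title", the object name MoviesByActress, and the sample rows in main are assumptions for illustration, not taken from the asker's CSV.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object MoviesByActress {
  // Assumed column names "actress" and "title"; replace with the real ones.
  def moviesByActress(df: DataFrame): RDD[(String, List[String])] =
    df.select("actress", "title")
      .rdd                                                     // RDD[Row]
      .map(row => (row.getString(0), List(row.getString(1))))  // (key, one-element list)
      .reduceByKey(_ ++ _)                                     // concatenate lists per actress

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("movies-by-actress")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the CSV
    val df = Seq(("A", "M1"), ("A", "M2"), ("B", "M3")).toDF("actress", "title")

    moviesByActress(df).collect().foreach(println)
    spark.stop()
  }
}
```

The order of titles within each list is not guaranteed. If only the grouping is needed, mapping to (actress, title) and calling groupByKey() is an alternative that avoids building intermediate lists.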

Hope this helps, Cheers

Chitral Verma