I have the following dataframe df in Spark Scala:
id project start_date Change_date designation
1 P1 08/10/2018 01/09/2017 2
1 P1 08/10/2018 02/11/2018 3
1 P1 08/10/2018 01/08/2016 1
then get designation closure to start_date and less than that
Expected output:
id project start_date designation
1 P1 08/10/2018 2
This is because change date 01/09/2017 is the closest date before start_date.
Can somebody advice how to achieve this?
This is not selecting first row but selecting the designation corresponding to change date closest to the start date