I have a dataframe df as mentioned below:
**customers** **product** **val_id** **rule_name** **rule_id** **priority**
1 A 1 ABC 123 1
3 Z r ERF 789 2
2 B X ABC 123 2
2 B X DEF 456 3
1 A 1 DEF 456 2
I want to create a new dataframe df2, which will have only unique customer ids, but as rule_name and rule_id columns are different for same customer in data, so I want to pick those records which has highest priority for the same customer, so my final outcome should be:
**customers** **product** **val_id** **rule_name** **rule_id** **priority**
1 A 1 ABC 123 1
3 Z r ERF 789 2
2 B X ABC 123 2
Can anyone please help me to achieve it using Spark scala. Any help will be appericiated.