I am trying to drop duplicate rows based on the column "id". How can I get the dropped rows, i.e. the ones that were removed because their "id" is a duplicate? This is the code I have so far:
val datatoBeInserted = data.select("id", "is_enabled", "code", "description", "gamme", "import_local", "marque", "type_marketing", "reference", "struct", "type_tarif", "family_id", "range_id", "article_type_id")
val cleanedData = datatoBeInserted.dropDuplicates("id")
With the code above, cleanedData contains exactly one row for each "id". Now I want to find out which rows were filtered out as duplicates.
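To make the goal concrete, here is a minimal sketch of one possible approach, not necessarily the best one. Instead of dropDuplicates, it numbers the rows within each "id" using a window function and splits on that number, so the kept and the dropped rows come from the same intermediate result. The sample data, the local SparkSession, and the names ranked, rn and droppedData are assumptions made for illustration; only datatoBeInserted and cleanedData come from the code above. Since dropDuplicates does not guarantee which row of a duplicate group is kept, the row kept here is equally arbitrary.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object DroppedDuplicatesSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val spark = SparkSession.builder()
      .appName("dropped-duplicates-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample standing in for datatoBeInserted,
    // reduced to two of the original columns.
    val datatoBeInserted = Seq(
      (1, "A"),
      (1, "B"),
      (2, "C")
    ).toDF("id", "code")

    // Number the rows inside each "id" group. Ordering by "id" itself
    // means the order within a group is arbitrary, like dropDuplicates.
    val byId = Window.partitionBy("id").orderBy("id")
    val ranked = datatoBeInserted
      .withColumn("rn", row_number().over(byId))
      .cache() // intended to keep both filters below on the same numbering

    // Row number 1 is kept; every further row of the group is a dropped duplicate.
    val cleanedData = ranked.filter(col("rn") === 1).drop("rn")
    val droppedData = ranked.filter(col("rn") > 1).drop("rn")

    cleanedData.show()
    droppedData.show()

    spark.stop()
  }
}
```

If a specific row should survive (for example the most recent one), the orderBy in the window would need a column that expresses that preference; no such column is assumed here.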