I have a CSV file in which I need to dedupe entries whose FIRST field matches, even if the other fields don't match. In addition, the line that is kept should be the one with the highest date in one of the other fields.
This is what my data looks like:
"47917244","000","OTC","20180718","7","2018","20180719","47917244","20180719"
"47917244","000","OTC","20180718","7","2018","20180731","47917244","20180731"
"47917244","000","OTC","20180718","7","2018","20180830","47917244","20180830"
All 3 lines have the same value in the first field. The 9th field is a date field, and I want to dedupe the file in such a way that the third line, which has the highest date value, is kept and the other two lines are deleted.
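
To make the desired behaviour concrete, here is a minimal Python sketch of it. It is only an illustration under some assumptions: the function name `dedupe_keep_latest` and its parameters are made up for this example, the dates are assumed to always be in YYYYMMDD form (so comparing them as strings gives chronological order), and the file is assumed to be small enough to hold in memory. The sample rows are embedded in a string here; a real file would be opened and passed to `csv.reader` instead.

```python
import csv
import io


def dedupe_keep_latest(rows, key_index=0, date_index=8):
    """Keep one row per key value: the row with the highest date field.

    key_index and date_index are 0-based, so the defaults refer to the
    1st and 9th fields. (Hypothetical helper, not from any library.)
    """
    best = {}
    for row in rows:
        key = row[key_index]
        # Assumption: dates are YYYYMMDD strings, which sort chronologically
        # under plain string comparison.
        if key not in best or row[date_index] > best[key][date_index]:
            best[key] = row
    # dicts preserve insertion order, so keys come out in first-seen order.
    return list(best.values())


sample = '''"47917244","000","OTC","20180718","7","2018","20180719","47917244","20180719"
"47917244","000","OTC","20180718","7","2018","20180731","47917244","20180731"
"47917244","000","OTC","20180718","7","2018","20180830","47917244","20180830"
'''

rows = list(csv.reader(io.StringIO(sample)))
kept = dedupe_keep_latest(rows)

# Write the result back out with every field quoted, as in the input.
out = io.StringIO()
csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(kept)
print(out.getvalue(), end="")
```

For the three sample lines this leaves only the line whose 9th field is 20180830. If two lines share both the first field and the highest date, this sketch keeps the first of them.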