I'm really desperate to get some data sorted. I have some data that I would like to sort in CSV using Python. If anyone could help, it would be much appreciated. It can't be sorted in Excel because there are too many rows (>15 million).
The format is as below:
X,Y,Level,ID,XYID
15.5,16.5,1.6,HB01,15.516.5
15.5,17.5,1.4,HB01,15.517.5
15.5,18.5,1.7,HB01,15.518.5
15.5,19.5,1.6,HB01,15.519.5
15.5,20.5,1.2,HB01,15.520.5
15.5,20.5,1.9,HB02,15.520.5
15.5,20.5,2.5,HB03,15.520.5
15.5,20.5,2.1,HB04,15.520.5
The XYID is just a concatenated text of X and Y. I want the data to be sorted so that in the second block, the only row that comes out is as below because it has the highest level (third column) and has the same XYID
15.5,20.5,2.5,HB03,15.520.5
And I want the final output to be a csv file with showing the first four rows because they have different XYID and the new 5th row with the one that shows the maximum level:
X,Y,Level,ID,XYID
15.5,16.5,1.6,HB01,15.516.5
15.5,17.5,1.4,HB01,15.517.5
15.5,18.5,1.7,HB01,15.518.5
15.5,19.5,1.6,HB01,15.519.5
15.5,20.5,2.5,HB03,15.520.5