I am looking for a way to get the rows of a data frame that hold the min value of a specific column within each group of a group-by operation.
This question shows a perfect example and also includes a working answer.
However, that operation is very computationally expensive. It might work on small datasets, but on large datasets it becomes a burden and takes a very long time to run.
In SQL, it is possible to use the ROW_NUMBER function to number the rows within each group and keep only the row with the min value, as shown here. That seems to be much faster, but what can I do when I already have a pandas DataFrame?
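To make the question concrete, here is a minimal sketch of the candidates I can think of in pandas. The data and the column names `group`, `value`, and `other` are made up for illustration, and I have not benchmarked these against each other, so which one is cheapest on a large frame is exactly what I am unsure about.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration only.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c"],
    "value": [3, 1, 5, 2, 4],
    "other": ["x", "y", "z", "w", "v"],
})

# Candidate 1: find the index label of the min value in each group,
# then select those rows. Assumes the index labels are unique.
min_rows_idxmin = df.loc[df.groupby("group")["value"].idxmin()]

# Candidate 2: sort by the value column, then keep the first row seen
# for each group. sort_index() only restores the original row order.
min_rows_sorted = (
    df.sort_values("value")
      .drop_duplicates("group", keep="first")
      .sort_index()
)

# Candidate 3: emulate SQL's ROW_NUMBER() OVER (PARTITION BY group ORDER BY value)
# with a per-group rank, then keep the rows ranked first.
row_number = df.groupby("group")["value"].rank(method="first")
min_rows_rank = df[row_number == 1]

print(min_rows_idxmin)
```

All three keep exactly one row per group. When several rows tie for the min, candidates 1 and 3 keep the first one in the original order; candidate 2 depends on the sort, so it would need a stable sort (for example `kind="stable"`) to guarantee the same choice.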
I reckon there might be a cheaper way to execute this operation.
Thanks, everyone.