Problem
I am working on a Python problem, where a car dealer wants to accumulate a list of vehicles where the total mileage across the chosen vehicles is a maximum (constraint 1, I don't know why he would want the highest mileage but it is what it is) and he has to remain under a certain budget (constraint 2, $300000).
Question
I know how to sort data based on 1 condition, but sorting it based on 2 values is trickier than I thought. What is the best way to achieve my problem? Please see my attempt below.
Small sample of Data
--------------------------------------------------
| Licence | Manufacturer | Price | Mileage
--------------------------------------------------
| 1 | Audi | 42000 | 8000
--------------------------------------------------
| 2 | Mercedes | 33000 | 15000
--------------------------------------------------
| 3 | Lexus | 38000 | 10000
--------------------------------------------------
| 4 | BMW | 25000 | 20000
--------------------------------------------------
| 5 | Mercedes | 55000 | 33000
--------------------------------------------------
My Attempt
I first thought of assigning a sort of weight between the mileage and price, because a car could have high mileage but its price could also be very high, so I thought sorting based just on mileage is incorrect. For example, suppose I have three cars, A, B and C. Car A has 10000 miles and costs $20000. Car B has 20000 miles but costs $40000. In this case, choosing either wouldn't make a difference. But suppose car C has 25000 miles, but it costs $80000. The algorithm should first consider Car A and B before considering C, even though C has the most mileage, it is not 'worth' the price.
So I created a new column that is the ratio between the mileage and price and sorted this list using that ratio as the key and then reversed it to get the ratios starting from the highest value. I then looped through this list, appending the cars to a new list if the total amount didn't exceed the budget.
cost = 0;
with open(fileName, 'r') as inputFile:
list1 = csv.reader(inputFile, delimiter=' ')
list2 = [(row[0], row[1], row[2], row[3], float(row[3])/float(row[2])) for l in list1]
list2.sort(key = lambda x: x[4])
list2.reverse()
cars2Buy = []
for l in list2:
if (cost + int(row[2])) <= 300000:
cost += int(row[2])
cars2Buy.append((row[0], row[1], row[2], row[3]))
else: break
However, I also tried a different dataset and sorted based just on the mileage, for example:
list2.sort(key = lambda x: x[3]),
instead of
list2.sort(key = lambda x: x[4])
and surprisingly in that particular dataset sorting based just on mileage gave me a list of cars that had more mileage than my 'weighting' algorithm and was still under budget! This must mean my way of solving this problem is flawed, but I can't see why. I would greatly appreciate any suggestions! Thanks.