0

I am trying determine optimal solution based on two variables(columns) being time and cost. I need to decide which Shipment provider is most optimal based on delivery time and delivery cost.

I would like to know if there is any way to do it in SQL or Python(Pandas). The values are something like:

delivery time delivery cost
"1 day 24:55:38.765958" 11.3057446808510638
"2 days 13:44:12.084918" 8.6606336633663366
"2 days 21:47:49" 13.0000000000000000
"2 days 28:21:07.42914" 35.5322866894197952
Gordan84
  • 35
  • 4
  • What is your definition of optimality? – Harry Haller Jan 28 '23 at 20:22
  • To be honest I am not sure either. I am still brain storm how to find best combination of two variables, making delivery time important but still taking into consideration cost of transport. – Gordan84 Jan 28 '23 at 22:31

1 Answers1

1

My understanding of pareto-optimal is that you want every row which has no rows which are better than it in both dimensions. That translates pretty readily into SQL:

select * from foo a where not exists (
    select 1 from foo b where b.delivery_time<a.delivery_time and b.delivery_cost < a.delivery_cost
);

I'm not sure what you should do in case of ties, though.

jjanes
  • 37,812
  • 5
  • 27
  • 34
  • To be honest I am still brain storm how to find best combination of two variables, making delivery time important but still taking into consideration cost of transport. I will up vote your answer for the effort and I thank you for it. I think that my final answer will be less scientific and more qualitative. – Gordan84 Jan 28 '23 at 22:33