2

I have quite a complex query to essentially select the cheapest delivery service price per delivery service.

In order to get unique records per delivery service, I utilise the DISTINCT function in SQL. This query provides correct results:

DeliveryServicePrice.active.select('DISTINCT ON (delivery_service_id) *').order('delivery_service_id, price ASC')

(only a part of the query)

However, this query only seems to work with PostgreSQL (which I think is strange considering PostgreSQL is a lot more strict with SQL standards); it does not work with MySQL and SQLite. I receive the following error:

Mysql2::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ON (delivery_service_id) * FROM `delivery_service_prices` WHERE `delivery_servi' at line 1: SELECT DISTINCT ON (delivery_service_id) * FROM `delivery_service_prices` WHERE `delivery_service_prices`.`active` = 1 AND (2808.0 >= min_weight AND 2808.0 <= max_weight AND 104.0 >= min_length AND 104.0 <= max_length AND 104.0 >= min_thickness AND 104.0 <= max_thickness) ORDER BY delivery_service_id, price ASC

The application I'm building is open source, so it's required to support all 3 database types. How do I create DISTINCT ON queries for MySQL and SQLite in the Rails framework syntax?

I'm using Rails 4.1.

Resources

My previous problem for reference:
How to select unique records based on foreign key column in Rails?

File and line number for where the query is being used.

Finished answer

DeliveryServicePrice.select('delivery_service_prices.id').active.joins('LEFT OUTER JOIN delivery_service_prices t2 ON (delivery_service_prices.delivery_service_id = t2.delivery_service_id AND delivery_service_prices.price > t2.price)').where('t2.delivery_service_id IS NULL')
Community
  • 1
  • 1
JellyFishBoy
  • 1,658
  • 1
  • 17
  • 25
  • DISTINCT * FROM delivery_service_id? – Steve Nov 12 '14 at 19:25
  • Did you try: `DeliveryServicePrice.active.distinct(:delivery_service_id).order('delivery_service_id, price ASC')`? – Surya Nov 12 '14 at 19:47
  • This query doesn't return the correct results, it just returns all the records since there is no DISTINCT condition. Translates to: `SELECT DISTINCT 'delivery_service_prices'.* FROM 'delivery_service_prices' WHERE 'delivery_service_prices'.'active' = 1 ORDER BY delivery_service_id, price ASC` – JellyFishBoy Nov 13 '14 at 07:11

2 Answers2

3

DISTINCT ON is a Postgres specific extension to the standard SQL DISTINCT. Neither of them is a "function", both are SQL key words - even though the parentheses required after DISTINCT ON make it look like a function.

There are a couple of techniques to rewrite this with standard-SQL, all of them more verbose, though. Since MySQL does not support window-functions row_number() is out.

Details and more possible query techniques:

Rewritten with NOT EXISTS:

SELECT *
FROM   delivery_service_prices d1
WHERE  active = 1
AND    2808.0 BETWEEN min_weight AND max_weight
AND    104.0  BETWEEN min_length AND max_length
AND    104.0  BETWEEN min_thickness AND max_thickness
AND NOT EXISTS (
   SELECT 1
   FROM   delivery_service_prices d2
   WHERE  active = 1
   AND    2808.0 BETWEEN min_weight AND max_weight
   AND    104.0  BETWEEN min_length AND max_length
   AND    104.0  BETWEEN min_thickness AND max_thickness
   AND    d2.delivery_service_id = d1.delivery_service_id
   AND    d2.price < d1.price 
   AND    d2.<some_unique_id> < d1.<some_unique_id>  -- tiebreaker!
   )
ORDER  BY delivery_service_id
  • If there can be multiple rows with the same price for the same delivery_service_id, you need to add some unique tie-breaker to avoid multiple results per delivery_service_id. At least if you want a perfectly equivalent query. My example would select the row with the smallest <some_unique_id> from each set of dupes.

  • Unlike with DISTINCT ON, ORDER BY is optional here.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks for your response, your query worked and it discovered a flaw in my delivery service price logic (unique prices per delivery service). I ended up using the following query as it performed quicker: http://stackoverflow.com/questions/121387/fetch-the-row-which-has-the-max-value-for-a-column/123481#123481. It also enabled me to merge the second query, so now I can execute all the arguments in one query. – JellyFishBoy Nov 13 '14 at 07:05
  • @JellyFishBoy: Nice harvest. :) – Erwin Brandstetter Nov 13 '14 at 09:07
2
DeliveryServicePrice.active.select(:delivery_service_id).distinct.order('delivery_service_id, price ASC')
daryn
  • 926
  • 5
  • 11
  • This query doesn't return the correct data, I'm looking to select delivery service price records, not their delivery_service_id attribute. – JellyFishBoy Nov 13 '14 at 07:10