I am going through the pain of normalising a horrible legacy database, and have discovered what I think is a bug with the DBMS.
This query returns the results as I expect:
SELECT DISTINCT RIGHT(SQUEEZE(thing_id), 2) AS thing_id, TRIM(thing_name)
FROM thing
ORDER BY thing_id, thing_name;
(16 rows)
The first time I ran the query, I had inadvertently used the wrong columns in the ORDER BY, as below:
SELECT DISTINCT RIGHT(SQUEEZE(thing_id), 2) AS thing_id, TRIM(thing_name)
FROM thing
ORDER BY thing_name, location;
(33 rows)
Note that the only thing to change is the ORDER BY, and the number of rows returned increases from 16 to 33. The results it gives are not DISTINCT as the query specifies.
I believe this is an outright bug, but a coworker says that it is normal because when we order by "location", it is selected an included invisibly in the results.
Should the ORDER BY ever affect the number of rows returned in a SELECT query?
EDIT: I had another person look at the queries AND I copied the queries to two seperate files, then ran a diff command on them. It is 100% certain that the only difference between the two queries is the columns listed in the ORDER BY.
UPDATE: Ingres have since release patch 14301 with bugfix: "Bug 126640 (GENERIC) Query with order-by expression and distinct aggregate returns more rows than expected. The columns in the order-by expression are not in the select list."
i.e. The query in question will now result in an error as the results are not correct.