I know that the query below causes the error - ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
SELECT DISTINCT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA'
ORDER BY birthdate;
What is the reason for this ? What is actually happening ? If I don't use DISTINCT or just add birthdate in SELECT or ORDER BY city only, it gives me some output, but not an error. Is it because SELECT DISTINCT city
only gives a result set with cities and nothing else ?
EDIT -(I think this may be an answer to my question)
SELECT city
FROM HR.Employees
WHERE country = N'USA' AND region = N'WA'
Consider the above query. Not sure, but I think this is how it works - Behind the scenes, SQL Server probably has a result set with all columns, but we are only shown the city column which I will call the "displayed set" . So, ordering by city or any other column is valid.
What happens to the result set when we use SELECT DISTINCT city instead ? SQL server does not have only the rows with DISTINCT cities in its result set. It has the entire result set, like the one generated by a SELECT * query. But, it shows only distinct rows based on city. Now can this displayed set be sorted based on birthdate ? No.
In a city with many employees, ie many birthdates, SQL server cannot tell which birthdate should be used for ordering the displayed set. That is why it displays an error message.