Order by only one, of two distinct columns

Question

I am selecting ColA and ColB in the same statement:

select distinct ColA, ColB, EDITABLE from TableA;

I would like to use ORDER BY lower(ColA), because I would like to sort ColA alphabetically, without taking capitalization into consideration. ColB does not need to be affect in this sorting and I would like to only sort ColA (which needs to be distinct because there are many instances of the same value in ColA).

I tried

select distinct ColA, ColB, EDITABLE
from TableA
ORDER BY lower(ColA)

and also saw this question concerning distinct and order by but when I tried

select distinct ColA, ColB, lower(ColA), EDITABLE
from TableA
GROUP BY ColA
ORDER BY lower(ColA) ASC, ColA

I was unable to run the statement above. I am very new at SQL and would love some tips on why this didn't work and help on what I can improve this statement on

"I was unable to run the statement above" Did you get a error? — Raymond Nijland, Mar 04 '19 at 14:21
Yes - running it Oracle SQL Developer results in: "ORA-00979: not a GROUP BY expression" — Jett, Mar 04 '19 at 14:27
"there are many instances of the same value in ColA" - are there many instances of the *combination* of ColA, ColB and editable? Distinct applies to the whole row in the result set, not a single column. — Alex Poole, Mar 04 '19 at 14:32
@AlexPoole it feels like the topicstarter wants to do `DISTINCT ON(ColA), ColB, lower(ColA), EDITABLE` (PostgreSQL syntax) and get unique `ColA` values for `ColB, lower(ColA), EDITABLE` columns. Topicstarter see [this](https://stackoverflow.com/questions/10515391/oracle-equivalent-of-postgres-distinct-on) — Raymond Nijland, Mar 04 '19 at 14:37
Possibly, yes. Sample data and expected results would help, of course... — Alex Poole, Mar 04 '19 at 14:40
Why do you want to use the `group by` approach instead of the `distinct` approach anyway? You aren't doing any aggregation, and you'll get the same result if you group by all of the columns you're selecting - but that's more code and it's less clear what the intent is. Having to suppress duplicates at all might suggest a problem with your data, but if you have a legitimate reason, what's wrong with the `distinct` keyword - that's what it's for? — Alex Poole, Mar 04 '19 at 18:13

score 7 · Accepted Answer · answered Mar 04 '19 at 14:24

7

Mention all the columns in the Group By that you are Selecting

In SQLServer, it is like:

select distinct ColA, ColB, lower(ColA)
from TableA
GROUP BY ColA, ColB, lower(ColA)
ORDER BY lower(ColA) ASC

answered Mar 04 '19 at 14:24

SU7

1,586
1
18
26

Most likely this will give a different result which the topicstarter is after. – Raymond Nijland Mar 04 '19 at 14:34
1

The `distinct` keyword is redundant - the group-by has the same effect. – Alex Poole Mar 04 '19 at 14:47
Thanks! Worked but I made the statement a bit shorter and it did the same trick for me `select ColA, ColB, EDITABLE from TableA GROUP BY ColA, ColB, EDITABLE ORDER BY lower(ColA) ASC` – Jett Mar 04 '19 at 15:23
My answer was the first one. Still not accepted as 'answer' @Jedo :'( nvm, happy coding :) – SU7 Mar 04 '19 at 16:00
1

@Jedo - you don't have to accept the first answer; [accept the answer that you believe is the best solution to your problem](https://stackoverflow.com/help/someone-answers). – Alex Poole Mar 04 '19 at 16:09
I don't like this answer. What is the `GROUP BY` for? You can remove it; it makes no sense. – Thorsten Kettner Mar 04 '19 at 17:28
@ThorstenKettner Please read Jedo's above comment. It solved his problem :) – SU7 Mar 04 '19 at 18:10
@AlexPoole Please read Jedo's above comment. It solved his problem :) – SU7 Mar 04 '19 at 18:11
1

Your query applies `DISTINCT` on the rows. The `GROUP BY` does exactly the same, only that it suggests that you wanted to apply some aggregation (`MIN`, `MAX`, `SUM`, `COUNT`, whatever). You don't. You even have exactly the same columns in `GROUP BY` and after `DISTINCT` which indicates that it cannot remove any duplicates from the list. So all in all, you are suggesting a query that looks faulty, because of superfluous and inappropriate elements. Your proposed query is not good and your explanation of how to use `GROUP BY` is misleading. Hence my downvote. – Thorsten Kettner Mar 04 '19 at 18:15
1

Sort of - after some editing. Getting the desired output - and this doesn't quite, from Jedo's comment - doesn't make it completely right, or the best approach. The first sentence sort of covers the issue - at least the error Jedo was getting - but the fix isn't ideal. My issue though was with your suggestion in a comment that this should be accepted because it was first. It's up to the question asker which answer (if any) they accept, being first shouldn't be a significant factor. – Alex Poole Mar 04 '19 at 18:19

Gordon Linoff · Answer 2 · 2019-03-04T16:58:34.680

2

Your referenced question does not apply to your question. In that question, there is another column used for ordering, one not in the original select distinct. In your case, the order by is using a function on one of the original columns. There should be no need to repeat the expression in the SELECT.

You can also use group by:

select ColA, ColB, EDITABLE
from TableA
group by ColA, ColB, EDITABLE
order by lower(ColA);

edited Mar 04 '19 at 16:58

answered Mar 04 '19 at 15:15

Gordon Linoff

1,242,037
58
646
786

I don't understand why this is the accepted answer. Isn't `EDITABLE` a column of `TableA`? Then we'd have to use an aggregate function on it, but this is not requested. So we'd put it in `GROUP BY` instead, but then we can remove the whole `GROUP BY` clause, add `DISTINCT` and end up exactly with the query Jedo has already shown in their request. Am I overlooking something? – Thorsten Kettner Mar 04 '19 at 15:52

score 2 · Answer 3 · answered Mar 04 '19 at 15:39

This is your query:

select distinct ColA, ColB, lower(ColA), EDITABLE
from TableA
GROUP BY ColA
ORDER BY lower(ColA) ASC, ColA

And this is what it does:

1.   FROM clause: select rows from TableA.
2.   GROUP BY clause: Aggregate rows so as to get one row per ColA.
3.   SELECT clause:
3.1.   Show ColA. This is okay, ColA is what you group by.
3.2.   Show ColB. Which? There can be diferent ColB per ColA. This produces an error.
3.3.   Show lower(ColA). That is okay. You group by ColA, so lower(ColA) is also known.
3.4.   Show EDITABLE. Again: which? There can be diferent EDITABLE per ColA.
3.5.   DISTINCT: remove duplicate rows. This is superfluous, because there cannot be
       duplicate rows. They all have different ColA.
4.   ORDER BY clause:
4.1.   Sort by lower(ColA), so you have 'John' and 'john' together.
4.2.   Sort by ColA. This tells the DBMS whther to put 'John' or 'john' first.

I hope this explains how a query gets executed, what GROUP BY does and what is allowed in the SELECT clause. DISTINCT is very often a sign for a badly written query. Here it's merely superfluous, but quite often it is used as some defense against poor joins leading to duplicate rows. Whenever you see SELECT DISTINCT ask yourself what makes it necessary.

Order by only one, of two distinct columns

3 Answers3