2

I have a table Vulnerabilities from which I want to query records ordered by severity (column). Severity column holds value "High", "Medium","low".

The below given query is giving me error:

for SELECT DISTINCT, ORDER BY expressions must appear in select list"

SELECT DISTINCT vuln 
FROM Vulnerabilities vuln 
WHERE (lower(vuln.dsc) LIKE '%tomcat%') 
ORDER BY CASE vuln.severity 
            WHEN 'High' THEN 1 
            WHEN 'Medium' THEN 2 
            WHEN 'Low' THEN 3 
         END ASC
Shruti Rawat
  • 687
  • 6
  • 11
  • 24
  • 3
    *SELECT DISTINCT vuln FROM ... vuln* Is there a column called `vuln` on your table or are you trying to select all the columns of the table? – ta.speot.is Jul 18 '14 at 06:00
  • The suggested duplicate is hardly a good choice. It's for SQL Server and the syntax is not 100 % applicable for Postgres. It also doesn't deal with `SELECT *`. – Erwin Brandstetter Jul 18 '14 at 13:52

1 Answers1

5

Use a subquery:

SELECT *
FROM  (
   SELECT DISTINCT *
   FROM   Vulnerabilities vuln 
   WHERE  lower(dsc) LIKE '%tomcat%'
   ) sub
ORDER  BY CASE severity 
            WHEN 'High'   THEN 1 
            WHEN 'Medium' THEN 2 
            WHEN 'Low'    THEN 3 
          END;

Or make the ORDER BY expression part of the SELECT list:

SELECT DISTINCT
       CASE severity 
           WHEN 'High'   THEN 1 
           WHEN 'Medium' THEN 2 
           WHEN 'Low'    THEN 3 
       END AS severity_order, *
FROM   Vulnerabilities vuln 
WHERE  lower(dsc) LIKE '%tomcat%'
ORDER  BY 1;

But chances are, you don't want DISTINCT * at all. Like @a_horse commented, this only makes sense for complete duplicates, which is a rare case and only possible if you don't have defined any unique columns (like a PK!)

DISTINCT may be redundant noise. Or your example may be a simplification of a more complex query, or you really want to GROUP BY / DISTINCT ON a few selected columns only. Here is a more useful related answer for the latter:

Basics for DISTINCT ON:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    `SELECT DISTINCT *` doesn't really makes sense because the `*` will include the PK column(s) and thus by definition there won't be duplicates that can be removed by the `distinct` (but then the `distinct` usage in the question does not really make sense either) –  Jul 18 '14 at 06:35
  • True if there is a PK, but that's just an assumption. This answers the question asked. I added a bit more to address these issues. – Erwin Brandstetter Jul 18 '14 at 13:51
  • The important part is that this works: `select distinct mycol from mytable order by mycol;` but this doesn't: `select distinct mycol from mytable order by upper(mycol);` though these do (thanks @ErwinBrandstetter): `select * from ( select distinct mycol from mytable ) t order by upper(t.mycol);` `select distinct mycol, upper(mycol) from mytable order by upper(mycol);` This is really just Postgres silliness, as there is no good reason why the order by clause should be in the select list in this case and other DBMSs don't require it. – Arnon Weinberg Feb 13 '17 at 06:02
  • 1
    @ArnonWeinberg: One of the algorithms used to implement `DISTINCT` sorts the table according to the `SELECT` and keeps each first row per set of dupes. Sorting the result differently would require another sort step. That's what you do with a subquery and another `ORDER BY` like demonstrated above. Aside: You can *append* any expressions to `ORDER BY` with `DISTINCT ON` - after expressions in the `DISTINCT ON` clause and completely independent of the `SELECT` list. I added a link to my main answer about `DISTINCT ON` above. – Erwin Brandstetter Feb 14 '17 at 03:05