159

After executing the following statement:

SELECT  Category  FROM MonitoringJob ORDER BY CreationDate DESC

I am getting the following values from the database:

test3
test3
bildung
test4
test3
test2
test1

but I want the duplicates removed, like this:

bildung
test4
test3
test2
test1

I tried to use DISTINCT but it doesn't work with ORDER BY in one statement. Please help.

Important:

  1. I tried it with:

    SELECT DISTINCT Category FROM MonitoringJob ORDER BY CreationDate DESC
    

    it doesn't work.

  2. Order by CreationDate is very important.

Pang
  • 9,564
  • 146
  • 81
  • 122
r.r
  • 7,023
  • 28
  • 87
  • 129

13 Answers13

269

The problem is that the columns used in the ORDER BY aren't specified in the DISTINCT. To do this, you need to use an aggregate function to sort on, and use a GROUP BY to make the DISTINCT work.

Try something like this:

SELECT DISTINCT Category, MAX(CreationDate) 
FROM MonitoringJob 
GROUP BY Category 
ORDER BY MAX(CreationDate) DESC, Category
Prutswonder
  • 9,894
  • 3
  • 27
  • 39
  • 132
    You don't even need the DISTINCT keyword if you are grouping by Category. – MatBailie Mar 22 '11 at 13:11
  • 2
    For an explanation of why you would need an aggregate function (in your ORDER BY, not necessarily in your SELECT!), I found a neat article: https://weblogs.sqlteam.com/jeffs/2007/12/13/select-distinct-order-by-error/ TLDR: When using MIN() or MAX() for each row after the GROUP BY, it grabs the highest or lowest value in the group. – Ken Nov 13 '20 at 12:17
38

Extended sort key columns

The reason why what you want to do doesn't work is because of the logical order of operations in SQL, as I've elaborated in this blog post, which, for your first query, is (simplified):

  • FROM MonitoringJob
  • SELECT Category, CreationDate i.e. add a so called extended sort key column
  • ORDER BY CreationDate DESC
  • SELECT Category i.e. remove the extended sort key column again from the result.

So, thanks to the SQL standard extended sort key column feature, it is totally possible to order by something that is not in the SELECT clause, because it is being temporarily added to it behind the scenes.

So, why doesn't this work with DISTINCT?

If we add the DISTINCT operation, it would be added between SELECT and ORDER BY:

  • FROM MonitoringJob
  • SELECT Category, CreationDate
  • DISTINCT
  • ORDER BY CreationDate DESC
  • SELECT Category

But now, with the extended sort key column CreationDate, the semantics of the DISTINCT operation has been changed, so the result will no longer be the same. This is not what we want, so both the SQL standard, and all reasonable databases forbid this usage.

Workarounds

It can be emulated with standard syntax as follows

SELECT Category
FROM (
  SELECT Category, MAX(CreationDate) AS CreationDate
  FROM MonitoringJob
  GROUP BY Category
) t
ORDER BY CreationDate DESC

Or, just simply (in this case), as shown also by Prutswonder

SELECT Category, MAX(CreationDate) AS CreationDate
FROM MonitoringJob
GROUP BY Category
ORDER BY CreationDate DESC

I have blogged about SQL DISTINCT and ORDER BY more in detail here.

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • 1
    I think you're mistaken with how `DISTINCT ON` works and pretty sure it doesn't help here. The expression in parentheses is what is used to determine distinctness (the grouping condition). If there are different categories with the same `CreationDate` then only one of them will appear in the result! Since I was wondering if maybe I was wrong somehow, I also loaded the example database in your blog post to double-check: the `DISTINCT ON` query you gave there produced a total of 1000 results (with plenty of duplicate `length`s) while the query below it gave only 140 (unique) values. – Inkling May 16 '19 at 06:35
  • 1
    @Inkling: Thanks for your time. The OP explicitly wants "duplicates" removed. See OP's wording *"but I want the duplicates removed, like this"*. You probably made a mistake when copying the queries from my blog post. There are two queries, one that uses `DISTINCT` (no `ON`) and one that uses `DISTINCT ON`. Please observe that the latter explicitly does not remove duplicate lengths, but duplicate titles. I do think that my answer here is entirely correct. – Lukas Eder May 16 '19 at 07:19
  • 1
    My point is that your `DISTINCT ON` conditions are removing duplicates using the wrong condition. In your blog post the `DISTINCT ON` query does indeed remove duplicate *titles*, however the `DISTINCT` query above it and the query below it (that you claim it is "syntax sugar" for) both remove duplicate *lengths*, as that's presumably the whole goal. The same thing applies here: the OP wants duplicate *Categories* removed, not duplicate *CreationDates* as the `DISTINCT ON` query does. If you still don't believe me, test for yourself. – Inkling May 17 '19 at 08:04
9

If the output of MAX(CreationDate) is not wanted - like in the example of the original question - the only answer is the second statement of Prashant Gupta's answer:

SELECT [Category] FROM [MonitoringJob] 
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Explanation: you can't use the ORDER BY clause in an inline function, so the statement in the answer of Prutswonder is not useable in this case, you can't put an outer select around it and discard the MAX(CreationDate) part.

halfer
  • 19,824
  • 17
  • 99
  • 186
Marc_Sei
  • 173
  • 2
  • 7
5

Just use this code, If you want values of [Category] and [CreationDate] columns

SELECT [Category], MAX([CreationDate]) FROM [MonitoringJob] 
             GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

Or use this code, If you want only values of [Category] column.

SELECT [Category] FROM [MonitoringJob] 
GROUP BY [Category] ORDER BY MAX([CreationDate]) DESC

You'll have all the distinct records what ever you want.

Prashant Gupta
  • 641
  • 9
  • 10
  • those braces [] are totally confusing... is this valid SQL syntax? – m13r Aug 27 '14 at 11:40
  • 2
    The brackets are for escaping keywords, such as Order, event, etc. so if you have (for example) a column in your table called `Event` you can write `[Event]` instead of `Event` to stop SQL throwing a parse error. – Ben Maxfield Oct 24 '16 at 21:26
2

2) Order by CreationDate is very important

The original results indicated that "test3" had multiple results...

It's very easy to start using MAX all the time to remove duplicates in Group By's... and forget or ignore what the underlying question is...

The OP presumably realised that using MAX was giving him the last "created" and using MIN would give the first "created"...

JohnSurrey
  • 59
  • 2
  • 4
    This doesn't really seem to answer the question, it seems to be a comment on other answerer's uses of `MAX`, rather than something that stands-alone as an answer to the question. – DaveyDaveDave Jun 01 '18 at 07:42
1
if object_id ('tempdb..#tempreport') is not null
begin  
drop table #tempreport
end 
create table #tempreport (
Category  nvarchar(510),
CreationDate smallint )
insert into #tempreport 
select distinct Category from MonitoringJob (nolock) 
select * from #tempreport  ORDER BY CreationDate DESC
Serg Chernata
  • 12,280
  • 6
  • 32
  • 50
Bob
  • 11
  • 1
-1

You can use CTE:

WITH DistinctMonitoringJob AS (
    SELECT DISTINCT Category Distinct_Category FROM MonitoringJob 
)

SELECT Distinct_Category 
FROM DistinctMonitoringJob 
ORDER BY Distinct_Category DESC
Renats Stozkovs
  • 2,549
  • 10
  • 22
  • 26
Jair
  • 1
-1

Distinct will sort records in ascending order. If you want to sort in desc order use:

SELECT DISTINCT Category
FROM MonitoringJob
ORDER BY Category DESC

If you want to sort records based on CreationDate field then this field must be in the select statement:

SELECT DISTINCT Category, creationDate
FROM MonitoringJob
ORDER BY CreationDate DESC
m13r
  • 2,458
  • 2
  • 29
  • 39
C Patel
  • 31
  • 1
  • 12
    This will execute but won't give what the OP needs. The OP wants distinct Categories, not distinct combinations of Category and CreateDate. This code can yield several instances of the same Category, each with different CreationDate values. – MatBailie Mar 22 '11 at 13:16
-1

By subquery, it should work:

    SELECT distinct(Category) from MonitoringJob  where Category in(select Category from MonitoringJob order by CreationDate desc);
Channa
  • 742
  • 17
  • 28
-1

We can do this with select sub query

Here is the the query:

SELECT * FROM (
    SELECT DISTINCT Category FROM MonitoringJob
) AS Tbl
ORDER BY Tbl.CreationDate DESC 
Jitendra Suthar
  • 2,111
  • 2
  • 16
  • 22
  • this would fail, because you don't have access to the creationdate any more after selecting only category – beatrice Sep 28 '22 at 09:36
-3

Try next, but it's not useful for huge data...

SELECT DISTINCT Cat FROM (
  SELECT Category as Cat FROM MonitoringJob ORDER BY CreationDate DESC
);
  • 4
    "The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified." – TechplexEngineer Jul 01 '13 at 12:57
  • This doesn't work because you don't specified the column CreationDate on the order by. – Mauro Bilotti Mar 19 '14 at 15:12
  • 1
    @TechplexEngineer Your comment is incorrect. Using `ORDER BY` in sub-queries is absolutely valid. And somebody even up-voted your incorrect comment. – Racil Hilan Mar 31 '14 at 18:05
  • I'm trying this and having same error with @TechplexEngineer. I'm using a custom ordering with case when. – Ege Bayrak Sep 08 '16 at 11:37
-4

It can be done using inner query Like this

$query = "SELECT * 
            FROM (SELECT Category  
                FROM currency_rates                 
                ORDER BY id DESC) as rows               
            GROUP BY currency";
Zaheer Babar
  • 1,636
  • 1
  • 15
  • 17
-7
SELECT DISTINCT Category FROM MonitoringJob ORDER BY Category ASC
Furicane
  • 1,173
  • 6
  • 18
  • 2
    i need it sorted by creation date!! it very important – r.r Mar 22 '11 at 12:59
  • So is it impossible to add the column you want to order by yourself? Your example showed entries ordered alphabetically. If you need to order by date created, just add it. It's really not that difficult. – Furicane Mar 22 '11 at 13:02
  • 8
    -1 : The OP tried that, it didn't work, because it's impossible and you apparently ignored that fact when patronising the OP. The point is that the DISTINCT operator will collate several records with the same Category value, each with potentially different createion dates. Thus it is logically impossible when using DISTINCT. This pushes the required logic to a GROUP BY instead of DISTINCT, allowing an aggregate (MAX) on the creation date. – MatBailie Mar 22 '11 at 13:13
  • Actually, if you take a closer look at what the OP did, which is absolutely malformed SQL - I hadn't make a single mistake and the result given corresponds to the one he requested. I won't bother to -1, just read next time before correcting people. Thank you. – Furicane Mar 22 '11 at 13:16
  • 9
    You directly suggest adding the CreationDate field, even saying "it's really not that difficult". Doing so yields the malformed SQL. You got -1 for patronising the OP, giving advice that takes the OP back to the statement he originally posted, and failing to notice the contention between DISTINCT and ordering by a field not in the DISTINCT. Additionally, 'b' comes before 't', and '1' comes before '4', thus the results given by the OP are categorically not in alphabetical order. May I suggest your own advice then : read (more carefully) next time. – MatBailie Mar 22 '11 at 13:20