3

My book says that -

1- Ranking functions (such as RANK(), DENSE_RANK(), ROW_NUMBER() etc.) need an OVER() clause.

So, the code below is wrong -

select *, RANK()
from [grant]
order by Amount desc

error - Incorrect syntax near 'RANK', expected 'OVER'.

If I add OVER() after RANK() in above code, i get an error again. (Error - The ranking function "RANK" must have an ORDER BY clause.)

2- Then, my books adds that "Ranking functions need ORDER BY information to appear as an argument in the OVER()".

select *, RANK() OVER(ORDER BY Amount DESC) as GrantRank
from [grant]

My questions are -

1 - Why do we need to have an OVER() clause with a ranking function ? 2 - Why do we have to remove the order by statement and put it inside the OVER() ?

The book does not explain the logic behind these things ! Please help me to understand it.

Thanks in advance.

sequel.learner
  • 3,421
  • 7
  • 22
  • 24
  • Just a warning: do not expect logic from SQL, especially from its syntax. – Mark Hurd Jan 13 '13 at 03:09
  • @MarkHurd - I am not sure if that is the right way to go about it. I believe that there must be some logic behind these things. Otherwise anything goes, you could put count() inside rank(), over inside add() etc...just saying. – sequel.learner Jan 13 '13 at 03:12
  • @MarkHurd - I was just thinking about what you said. Would you say that if these questions were asked in an interview ? – sequel.learner Jan 13 '13 at 03:18
  • I should have added a smiley :-) SQL _can_ be understood, but a number of its technical issues are a bit arcane. If this sort of thing was asked in an interview, I'd be honest and say I'd need a little trial and error to get it right -- of course at some stage I'd check the manuals (or Google really). – Mark Hurd Jan 13 '13 at 03:30

2 Answers2

4

Why do we need to have an OVER() clause with a ranking function?

The OVER() clause is needed so that SQL Server knows exactly how you want to determine things like RANK(). What RANK() do you expect if you don't supply SQL Server with an ordering criteria? Is the winner of a race the one with the fastest time, the slowest time, or the first name alphabetically?

Why do we have to remove the order by statement and put it inside the OVER()?

You don't need to remove the ORDER BY clause when you add an ORDER BY clause inside the OVER(). These are used independently - one to determine the RANK() and the other to dictate ordering.

So, for example, if you wanted to return the finishers of a race, but order them last place to first place, you might say:

SELECT 
  name, 
  finish_time, 
  [rank] = RANK() OVER (ORDER BY finish_time) -- fastest first
FROM 
  dbo.race_table
ORDER BY 
  finish_time DESC; -- fastest last
Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • Thanks. It makes sense now. I had a trivial question - why was it named OVER(). So, i looked at an Oracle SQL question - http://stackoverflow.com/questions/1092120/over-clause-in-oracle It says - "The OVER clause specifies the partitioning, ordering & window "over which" the analytic function operates." Is that the answer to my trivial question ? I don't know what partitioning and analytic functions mean. – sequel.learner Jan 13 '13 at 03:39
  • Think of partitioning as "slicing" and analytic as, well, analytic - so you may want to know, for example, the person who had the highest mark in each class. The choice of the word `OVER()` was not a choice Microsoft made - they are adhering to the ANSI standard as much as they can. So you'd have to read the standard or ask the authors for a more accurate explanation than you might find in some of these answers. That said, the "over which" sounds plausible to me. – Aaron Bertrand Jan 13 '13 at 03:42
0

To understand what's going on, you have to understand what the RANK function is doing -- it's returning a ranking of the data based on some column(s). For those that are tied, you get the same ranking. So if you want to know which row in your grant table have the highest amounts, you'd want to order by that particular column descending.

Imagine rankings in any sport -- you can have two teams tied for the first place, and the next team in the rankings would be 3rd. That's essentially what the RANK function is doing for you.

This is a useful article on MSDN.

And here is SQL Fiddle that may help as well.

sgeddes
  • 62,311
  • 6
  • 61
  • 83