0

I have a question whether RANK () OVER function can give the 'most popular' value?

Question:

What route was the most popular of those with an average trip time of at least 8 minutes, excluding trips longer than 2 hours?

Can someone confirm whether RANK() OVER satisfies this 'popularity' value?

Davidskis
  • 19
  • 5

2 Answers2

0

rank should be OK; though, I understood the question a little bit differently than you: first restrict all trips to those shorter than 2 hours, and then calculate the rest of data you need. Something like this:

select start_station, 
       end_station,
       rank() over (order by cnt desc) rnk
from 
  (select start_station, 
          end_station,
          avg(duration_seconds) avg_duration,
          count(*) cnt
   from tutorial.dc_bikeshare_q1_2012
   where duration_seconds < 7200
   group by start_station, end_station
  ) as x
where avg_duration >= 480;
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
0

The question asks for the one most popular route. rank() would be a waste of time. ORDER BY + LIMIT 1 should be cheaper:

SELECT start_station, end_station, count(*) AS ct
FROM   tutorial.dc_bikeshare_q1_2012
WHERE  duration_seconds <= 7200
GROUP  BY start_station, end_station
HAVING avg(duration_seconds) >= 480
ORDER  BY ct DESC  -- add more expressions to break ties
LIMIT  1;

You are not the only one working on this:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228