0

I have the following data

app people login
slack Jenny 2019-01-01
slack Jenny 2020-01-01
slack Jenny 2021-01-01
slack Yang 2020-06-01
slack Yang 2021-06-01

For each group app and user, I need to get the latest login date. I tried to use a window function, but could not get the result of what I am expecting:

SELECT app, people, LAST_VALUE(login) OVER (PARTITION BY app, people ORDER BY login)
FROM xxxxx

Any suggestions?

lemon
  • 14,875
  • 6
  • 18
  • 38
yyiws
  • 69
  • 3

2 Answers2

1

To get first, last value from group there is no need for window function, just distinct select. I've assumed that each {appp, people} pair must be presented only once in results.

select distinct on (app,people)
app, people, login from tab
order by app, people, login desc

DB Fiddle

And example with LAST_VALUE

SELECT 
  app, 
  people, 
  LAST_VALUE(login) OVER (
         PARTITION BY app, people 
         ORDER BY login ASC 
         RANGE BETWEEN UNBOUNDED PRECEDING AND 
                UNBOUNDED FOLLOWING)
FROM tab

When ORDER BY is used inside window function then window function run in "running sum" mode, knowing only about rows already processed inside partition. If we want to get information from whole partition then we must explicitly point that.

In this case, using MAX looks like better choice (as @lemon pointed)

SELECT 
   app, 
   people, 
   login,  
   MAX(login) OVER (PARTITION BY app, people)
FROM tab
Kadet
  • 1,344
  • 3
  • 10
  • Thanks for your response, I know your query will work, but why the window function do not do the job? any idea? – yyiws Jul 01 '22 at 20:19
  • It really depends on whether you want to keep the same amount of rows or want to "*aggregate*" the rows. In the former case you need a window function, in the latter one you want an aggregation function. @yyiws – lemon Jul 01 '22 at 20:24
  • @yyiws window function also works, but you need to explicitly show that all rows in partition should be scanned – Kadet Jul 01 '22 at 20:27
  • 1
    At that point use `MAX`, as long as it automatically implies your `RANGE` clause. – lemon Jul 01 '22 at 20:31
0

You can use the MAX window function in place of the LAST_VALUE window function:

SELECT app, 
       people, 
       MAX(login) OVER (PARTITION BY app, people ORDER BY login DESC)
FROM tab

Check the demo here.


Instead, if you want to aggregate your "app" and "people" values on the last login value, you can also use the MAX aggregation function:

SELECT app, 
       people, 
       MAX(login)
FROM tab
GROUP BY app, 
         people

Check the demo here.

lemon
  • 14,875
  • 6
  • 18
  • 38
  • `ORDER` in `OVER` is important to return proper `LAST_VALUE` from a group. Without order it's quite random – Kadet Jul 01 '22 at 20:19
  • You're right. Anyways `LAST_VALUE` really isn't the proper window function to use. – lemon Jul 01 '22 at 20:22