1

I have a table like this in a PostgreSQL database:

name, salary, birthdate, last_update
1, Ann Alden, 123000.00, 1986-03-04, 2022-01-04 11:30:30.0
3, Chloe Cat, 99999999.99, 1999-01-15, 2022-01-04 12:25:45.0
4, Don Denton, 91234.24, 2004-08-03, 2022-01-04 12:45:00.0
5, Eddy Edwards, 55125125.25, 2003-05-17, 2022-01-05 23:00:00.0

How would I return all fields of a pair of people where the last_update field is less than an hour apart?

I would expect the output of:

id, name, salary, birthdate, last_update, id, name, salary, birthdate, last_update
1, Ann Alden, 123000.00, 1986-03-04, 2022-01-04 11:30:30.0, 3, Chloe Cat, 99999999.99, 1999-01-15, 2022-01-04 12:25:45.0
3, Chloe Cat, 99999999.99, 1999-01-15, 2022-01-04 12:25:45.0, 4, Don Denton, 91234.24, 2004-08-03, 2022-01-04 12:45:00.0"
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228

1 Answers1

0

For only consecutive pairs, use one of the window functions lead() or lag():

SELECT (t1).*, (t2).*
FROM  (
   SELECT t AS t1, lead(t) OVER (ORDER BY last_update, id) AS t2
   FROM   tbl AS t
   ) sub
WHERE  (t1).last_update > (t2).last_update - interval '1 hour';

db<>fiddle here

The query only considers consecutive rows (according to last_update). Else it actually gets simpler but more expensive, and you get many pairs for clusters of updates. (See below.)

Also, while last_update is not defined UNIQUE, there is ambiguity in the sort order. I use id as tiebreaker. (You did not specify.)

Since you ask for whole rows, I use row types (composite values) to keep it simple. All parentheses are required.

The result has duplicate column names like you requested. That's allowed in Postgres, but I wouldn't do it as it introduces ambiguity that many clients can't handle.

Related:

To get all pairs, not just consecutive ones:

SELECT *
FROM   tbl t1
JOIN   tbl t2 ON t1.last_update < t2.last_update
             AND t1.last_update > t2.last_update - interval '1 hour';
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228