I understand you want:
the row with the earliest time of the latest day for every user
Table design
For starters: drop the Date
column. Storing it redundantly adds more cost and complications than it's worth:
CREATE TABLE userdata (
user_id int
, datetime timestamp
, input text
);
input
should really be some cheap enumeration implementation (enum
, FK, ...).
timestamptz
might be the appropriate type for datetime
. Depends. See:
Index
Either way, to make your operation fast, this is the perfect index:
CREATE INDEX userdata_special_idx ON userdata
(user_id, (datetime::date) DESC NULLS LAST, datetime);
datetime::date
is a very cheap cast, replacing your redundant date column. I still add the date to the multicolumn expression index for performance. (The date depends on the time zone when working with timestamptz
. If you deal with multiple time zones you need to do more.)
Note the added NULLS LAST
: since nothing in your question says that the the timestamp is NOT NULL
, you need this in the query to prevent nonsense results - and the index must match for best results. See:
Query
For only few rows per user, DISTINCT ON
should be the best choice (like GMB already suggested) - simple and fast:
SELECT DISTINCT ON (user_id)
user_id, datetime, input
FROM userdata
ORDER BY user_id, datetime::date DESC NULLS LAST, datetime;
See:
For many rows per user, this alternative query should be (substantially) faster:
SELECT u.user_id, d.*
FROM users u
LEFT JOIN LATERAL (
SELECT d.datetime, d.input
FROM userdata d
WHERE d.user_id = u.user_id -- lateral reference
ORDER BY d.datetime::date DESC NULLS LAST, d.datetime
LIMIT 1
) d ON true;
Typically, this is the way to go for your scenario.
Note the LEFT JOIN
: it returns a row for every user, even with no entries in userdata
. If that's not desirable, use CROSS JOIN
instead. Related:
This assumes the existence of a users
table, which typically exists. If it doesn't, I suggest you add it (for multiple reasons). If that's not an option, there are still fast workarounds. See:
db<>fiddle here
Aside: I warmly recommend to always use ISO date format (as does the manual). Regional format depends on settings of the current session and may fail in ugly ways.