Add an index to a timestamp with time zone

Question

I want to improve this slow query, I thinking to add an index, but I don't know what index type is better to my case.

SELECT COUNT(*) ct FROM events
WHERE dtt AT TIME ZONE 'America/Santiago'
   >= date(now() AT TIME ZONE 'America/Santiago') + interval '1s'

Query Plan:

"Aggregate  (cost=128032.03..128032.04 rows=1 width=0) (actual time=3929.083..3929.083 rows=1 loops=1)"
"  ->  Seq Scan on events  (cost=0.00..125937.68 rows=837742 width=0) (actual time=113.080..3926.972 rows=25849 loops=1)"
"        Filter: (timezone('America/Santiago'::text, dtt) >= (date(timezone('America/Santiago'::text, now())) + '00:00:01'::interval))"
"        Rows Removed by Filter: 2487386"
"Planning time: 0.179 ms"
"Execution time: 3929.136 ms"

The query gets the count of events of the day.
dtt is a timestamp with time zone column.
I'm using Postgresql 9.4.

Note: With the Erwin advices the query run a little faster but still I think isn't fast enough.

"Aggregate  (cost=119667.76..119667.77 rows=1 width=0) (actual time=3687.151..3687.152 rows=1 loops=1)"
"  ->  Seq Scan on vehicle_events  (cost=0.00..119667.14 rows=250 width=0) (actual time=104.635..3687.068 rows=469 loops=1)"
"        Filter: (dtt >= timezone('America/Santiago'::text, date_trunc('day'::text, timezone('America/Santiago'::text, now()))))"
"        Rows Removed by Filter: 2513337"
"Planning time: 0.164 ms"
"Execution time: 3687.204 ms"

Nitpicking, your query does ***not*** "get the count of events of the day". It gets the count of *all* events since the beginning of the day - including future timestamps if there are some, which might easily happen with multiple time zones. — Erwin Brandstetter, Aug 17 '15 at 04:44

Erwin Brandstetter · Accepted Answer · 2023-02-14T08:56:42.263

11

First, fix your query to make the predicate "sargable":

SELECT count(*) AS ct
FROM   events
WHERE  dtt >= date_trunc('day', now() AT TIME ZONE 'America/Santiago')
                                      AT TIME ZONE 'America/Santiago';

Use the column value as is and move all calculations to the parameter.

That's right, after truncating to the local start of the day, apply AT TIME ZONE a second time to convert the timestamp value back to timestamptz again. See:

Ignoring time zones altogether in Rails and PostgreSQL

Explanation step-by-step

now()
.. is the Postgres implementation for the SQL standard CURRENT_TIMESTAMP. Both are 100 % equivalent, you can use either. It returns the current point in time as timestamptz - the display of the value takes the time zone of the current session into consideration, but that's irrelevant for the value.
now() AT TIME ZONE 'America/Santiago'
.. computes the local time for the given time zone. The resulting data type is timestamp. We do this to allow for:
date_trunc( now() AT TIME ZONE 'America/Santiago' )
.. truncates the time component to get the local start of the day in 'America/Santiago', independent of the current time zone setting.
date_trunc('day', now() AT TIME ZONE 'America/Santiago') AT TIME ZONE 'America/Santiago'
.. feeding the timestamp to the AT TIME ZONE construct we get the corresponding timestamptz value (UTC internally) to compare the timestamptz value dtt to.

I removed the + interval '1s', suspecting you have just been abusing that to convert the date to timestamp. Use date_trunc() instead to produce a timestamp value.

Now, a plain (default) btree index on dtt will do. Of course, the index will only be used, if the predicate is selective enough.

CREATE INDEX events_dtt_idx ON events (dtt);

If your important queries only consider recent rows, a partial index might help some more. Details:

Get latest child per parent from big table - query is too slow

edited Feb 14 '23 at 08:56

answered Aug 17 '15 at 04:17

Erwin Brandstetter

605,456
145
1,078
1,228

Thanks Erwin for your answer, I have now a better result, but the query still is slow, is needed do a reindex to my table? – Goku Aug 17 '15 at 04:30
@Goku: `REINDEX` is only necessary if the index is terribly bloated or corrupt. `VACUUM ANALYZE events` might help to give you an [index-only scan](https://wiki.postgresql.org/wiki/Index-only_scans). If you can't get index-only scans due to too much writing activity, clustering the table might make it faster. [Details here](http://stackoverflow.com/questions/13998139/optimize-postgres-timestamp-query-range/14007963#14007963) – Erwin Brandstetter Aug 17 '15 at 04:39
Did you forget to convert back to UTC? Or does that happen implicitly in the comparison? – Matt Johnson-Pint Aug 17 '15 at 17:43
@Matt: Neither. "Converting to UTC" is an implicit detail of converting to `timestamptz` - which is a UTC timestamp internally and works independent of time zones. Don't be confused by the somewhat misleading data type name: http://stackoverflow.com/a/28876266/939860. Follow the links for details. – Erwin Brandstetter Aug 17 '15 at 18:12
Ok, so `date_trunc` still returns a `timestamptz`. It's truncated to the start of the local date of the specified time zone, but it still internally tracks the UTC point in time - correct? Does the truncation properly deal with time zones that skip over midnight on a DST transition day? (Example, `America/Sao_Paulo` started the day at 1:00 AM on Oct 19, 2014.) – Matt Johnson-Pint Aug 17 '15 at 20:53
1

@MattJohnson: No. `date_trunc()` returns `timestamp` for `timestamp` input and `timestamptz` for `timestamptz`. I added an explanation step-by-step to clarify. – Erwin Brandstetter Aug 17 '15 at 22:32
Moving the `AT TIME ZONE` part from the column to the parameter improved the time of my query from 11 seconds to 4 ms(!). Thank you for your answer! – Wim Deblauwe Feb 14 '23 at 06:42

Add an index to a timestamp with time zone

1 Answers1

Explanation step-by-step