0

I've got the sql query:

SELECT cast(AVG(SNR) AS integer) AS snr,
       cast(AVG(RSSI) AS integer) AS rts
FROM SESSION
WHERE DATE(associationtime)>DATE(NOW()- INTERVAL '21 DAYS');

It works slow, because 21 days contains 300k rows.

 Aggregate  (cost=21768.07..21768.09 rows=1 width=8) (actual time=346.794..346.795 rows=1 loops=1)
   ->  Seq Scan on session  (cost=0.00..20095.77 rows=334459 width=8) (actual time=0.014..282.512 rows=345304 loops=1)
         Filter: (date(associationtime) > date((now() - '21 days'::interval)))
         Rows Removed by Filter: 148508
 Total runtime: 346.867 ms

How can I improve my query? May I create index or something?

UPD:

Index on associationtime doesn't help.

postgres=# CREATE INDEX session_lim_values_idx ON session (associationtime);
CREATE INDEX
postgres=# EXPLAIN (ANALYZE) SELECT cast(AVG(SNR) as integer) as snr, cast(AVG(RSSI) as integer)  as rts FROM session WHERE DATE(associationtime)>DATE(NOW()- INTERVAL '21 DAYS');
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=21768.07..21768.09 rows=1 width=8) (actual time=347.654..347.654 rows=1 loops=1)
   ->  Seq Scan on session  (cost=0.00..20095.77 rows=334459 width=8) (actual time=0.014..283.344 rows=345304 loops=1)
         Filter: (date(associationtime) > date((now() - '21 days'::interval)))
         Rows Removed by Filter: 148508
 Total runtime: 347.731 ms

And DATE(associationtime), too:

postgres=# CREATE INDEX session_lim_values_idx ON session (DATE(associationtime));
CREATE INDEX
postgres=# EXPLAIN (ANALYZE) SELECT cast(AVG(SNR) as integer) as snr, cast(AVG(RSSI) as integer)  as rts FROM session WHERE DATE(associationtime)>DATE(NOW()- INTERVAL '21 DAYS');
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=21768.07..21768.09 rows=1 width=8) (actual time=341.050..341.050 rows=1 loops=1)
   ->  Seq Scan on session  (cost=0.00..20095.77 rows=334459 width=8) (actual time=0.015..278.247 rows=345304 loops=1)
         Filter: (date(associationtime) > date((now() - '21 days'::interval)))
         Rows Removed by Filter: 148508
 Total runtime: 341.129 ms
Tony
  • 3,605
  • 14
  • 52
  • 84
  • 1
    What data type is `associationtime` –  Jul 14 '14 at 08:24
  • 2
    What portion of the table are you selecting? Whats `count(*) from session` vs `count(*) from session where associationtime > ...` – Angelo Fuchs Jul 14 '14 at 08:50
  • An index on `associationTime` won't help because you're calling a function on it (that has to do some complicated math, no less). It's probably the index you actually want, though (more useful than the one with the function), which means your search criteria would have to be something like `associationTime >= CURRENT_DATE - INTERVAL '20 DAYS'`. – Clockwork-Muse Jul 14 '14 at 08:51
  • @Clockwork-Muse: that's what a function based index is for (and which apparently doesn't work) –  Jul 14 '14 at 08:53
  • @a_horse_with_no_name - I know that, it's just that I don't consider one based on `DATE(...)` to be all that useful (given that it's trivial to query for the timestamp range). – Clockwork-Muse Jul 14 '14 at 09:04

3 Answers3

3

As you only care about whole days you may want to cache the result in a materialized view.

CREATE MATERIALIZED VIEW matview_avg_session
AS SELECT cast(AVG(SNR) AS integer) AS snr,
   cast(AVG(RSSI) AS integer) AS rts
   FROM SESSION
   WHERE DATE(associationtime) > DATE(NOW()- INTERVAL '21 DAYS');

then access the data like this:

SELECT * FROM matview_avg_session;

and refresh it (automatically once per day) like this:

REFRESH MATERIALIZED VIEW matview_avg_session;

or you look at this answer on how to create a trigger for refreshing it, but keep in mind that you don't want to do this after EVERY insert... Refresh a materialized view automatically using a rule or notify

Community
  • 1
  • 1
Angelo Fuchs
  • 9,825
  • 1
  • 35
  • 72
1

You could use a covering index, i.e., to force the executor using a index-only scan.

For covering index you add columns used in where clauses first then columns used in group by, then columns used in order by and then columns used in select.

  ALTER TABLE session ADD KEY ix1(date(associationtime), <remaining_columns>);

where <remaining_columns> are those, you would write in the group by clause in the SFW statement.

paubo147
  • 748
  • 4
  • 8
1

remove the date conversion from associationtime so the index can catch on.

SELECT cast(AVG(SNR) AS integer) AS snr,
       cast(AVG(RSSI) AS integer) AS rts
FROM SESSION
WHERE associationtime > DATE(NOW() - INTERVAL '20 DAYS');

If that doesn't help do a VACUUM ANALYZE on it, then try again.

Angelo Fuchs
  • 9,825
  • 1
  • 35
  • 72
  • You're missing midnight of the given date.... or rather, you're counting most of an extra day (everything but midnight) that wasn't included in the original set. – Clockwork-Muse Jul 14 '14 at 09:01
  • @Clockwork-Muse You are correct, I fixed the "most of an extra day". The loss of midnight is correct, but with 300k entries that shouldn't make that much of a difference. – Angelo Fuchs Jul 14 '14 at 10:00