
I've got a query:

EXPLAIN ANALYZE
SELECT cast(date(associationtime) AS text) AS date,
       cast(SUM(extract(epoch FROM disassociationtime)
              - extract(epoch FROM associationtime)) AS bigint) AS sessionduration,
       cast(SUM(tx) AS bigint) AS tx,
       cast(SUM(rx) AS bigint) AS rx,
       cast(SUM(dataRetries) AS bigint) AS data,
       cast(SUM(rtsRetries) AS bigint) AS rts,
       count(*)
FROM SESSION
WHERE ssid_id=42
  AND ap_id=1731
  AND DATE(associationtime)>=DATE('Tue Nov 04 00:00:00 MSK 2014')
  AND DATE(associationtime)<=DATE('Thu Nov 20 00:00:00 MSK 2014')
GROUP BY(DATE(associationtime))
ORDER BY DATE(associationtime);

The output is:

 GroupAggregate  (cost=0.44..17710.66 rows=1 width=32) (actual time=4.501..78.880 rows=17 loops=1)
   ->  Index Scan using session_lim_values_idx on session  (cost=0.44..17538.94 rows=6868 width=32) (actual time=0.074..73.266 rows=7869 loops=1)
         Index Cond: ((date(associationtime) >= '2014-11-04'::date) AND (date(associationtime) <= '2014-11-20'::date))
         Filter: ((ssid_id = 42) AND (ap_id = 1731))
         Rows Removed by Filter: 297425
 Total runtime: 78.932 ms

Look at this line:

Index Scan using session_lim_values_idx

As you can see, the query scans on three fields: ssid_id, ap_id and associationtime. I've got an index for this:

ssid_pkey                  | btree | {id}
ap_pkey                    | btree | {id}
testingshit_pkey           | btree | {one,two,three}
session_date_ssid_idx      | btree | {ssid_id,date(associationtime),"date_trunc('hour'::text, associationtime)"}
session_pkey               | btree | {associationtime,disassociationtime,sessionduration,clientip,clientmac,devicename,tx,rx,protocol,snr,rssi,dataretries,rtsretries }
session_main_idx           | btree | {ssid_id,ap_id,associationtime,disassociationtime,sessionduration,clientip,clientmac,devicename,tx,rx,protocol,snr,rssi,dataretries,rtsretries}
session_date_idx           | btree | {date(associationtime),"date_trunc('hour'::text, associationtime)"}
session_date_apid_idx      | btree | {ap_id,date(associationtime),"date_trunc('hour'::text, associationtime)"}
session_date_ssid_apid_idx | btree | {ssid_id,ap_id,date(associationtime),"date_trunc('hour'::text, associationtime)"}
ap_apname_idx              | btree | {apname}
users_pkey                 | btree | {username}
user_roles_pkey            | btree | {user_role_id}
session_lim_values_idx     | btree | {date(associationtime)}

It's called session_date_ssid_apid_idx. But why does the query use the wrong index?

session_date_ssid_apid_idx:

------------+-----------------------------+-------------------------------------------
 ssid_id    | integer                     | ssid_id
 ap_id      | integer                     | ap_id
 date       | date                        | date(associationtime)
 date_trunc | timestamp without time zone | date_trunc('hour'::text, associationtime)

session_lim_values_idx:

date    | date | date(associationtime)

What index would you create?

UPD: \d session

 --------------------+-----------------------------+------------------------------------------------------
 id                 | integer                     | NOT NULL DEFAULT nextval('session_id_seq'::regclass)
 ssid_id            | integer                     | NOT NULL
 ap_id              | integer                     | NOT NULL
 associationtime    | timestamp without time zone | NOT NULL
 disassociationtime | timestamp without time zone | NOT NULL
 sessionduration    | character varying(100)      | NOT NULL
 clientip           | character varying(100)      | NOT NULL
 clientmac          | character varying(100)      | NOT NULL
 devicename         | character varying(100)      | NOT NULL
 tx                 | integer                     | NOT NULL
 rx                 | integer                     | NOT NULL
 protocol           | character varying(100)      | NOT NULL
 snr                | integer                     | NOT NULL
 rssi               | integer                     | NOT NULL
 dataretries        | integer                     | NOT NULL
 rtsretries         | integer                     | NOT NULL
Indexes:
    "session_pkey" PRIMARY KEY, btree (associationtime, disassociationtime, sessionduration, clientip, clientmac, devicename, tx, rx, protocol, snr, rssi, dataretries, rtsretries)
    "session_date_ap_ssid_idx" btree (ssid_id, ap_id, associationtime)
    "session_date_apid_idx" btree (ap_id, date(associationtime), date_trunc('hour'::text, associationtime))
    "session_date_idx" btree (date(associationtime), date_trunc('hour'::text, associationtime))
    "session_date_ssid_apid_idx" btree (ssid_id, ap_id, associationtime)
    "session_date_ssid_idx" btree (ssid_id, date(associationtime), date_trunc('hour'::text, associationtime))
    "session_lim_values_idx" btree (date(associationtime))
    "session_main_idx" btree (ssid_id, ap_id, associationtime, disassociationtime, sessionduration, clientip, clientmac, devicename, tx, rx, protocol, snr, rssi, dataretries, rtsretries)
Erwin Brandstetter
Tony
  • Is that really the complete output of the execution plan? I would expect at least one more step in it that looks up the other columns. Btw: you can remove one of the indexes, `ssid_pkey` or `ap_pkey`; they are identical. It's also better to show the list of indexes from the output of psql's `\d` command rather than the (somewhat confusing) content of the system catalog (or at least use the view `pg_indexes`) –  Nov 20 '14 at 10:51
  • From what I see so far, the index `session_date_ssid_apid_idx` should get used. Either there is something missing in your question or there is something wrong with your DB. I would drop that index (or all of them), run `VACUUM FULL ANALYZE session`, recreate the index (or all of them) and try again. Or use [pg_repack](https://github.com/reorg/pg_repack) if you can't afford to lock the table. **Or** most of your rows have `ssid_id=42 AND ap_id=1731`, so that these predicates are insignificant for the selection of the index and it's cheaper to use the smaller index and filter the rest. – Erwin Brandstetter Nov 20 '14 at 11:05
  • @ErwinBrandstetter, it seems like you are right about `ssid_id=42 AND ap_id=1731`. If I change these values to less popular ones, the new (right) index is selected. – Tony Nov 20 '14 at 11:42
  • What do you get for `SELECT count(*) AS a, count(ssid_id=42 AND ap_id=1731 OR NULL) AS b FROM session`? – Erwin Brandstetter Nov 20 '14 at 11:44
  • And for `SELECT count(associationtime BETWEEN '2014-11-04 0:0' AND '2014-11-20 0:0' OR NULL) AS a, count(associationtime BETWEEN '2014-11-04 0:0' AND '2014-11-20 0:0' AND ssid_id=42 AND ap_id=1731 OR NULL) AS b FROM session`? – Erwin Brandstetter Nov 20 '14 at 11:49
  • Two more: For `SHOW default_statistics_target;` and `SELECT attname, attstattarget FROM pg_attribute WHERE attrelid = 'session'::regclass AND attnum > 0;` Best put the data in your question. – Erwin Brandstetter Nov 20 '14 at 11:57
  • First query: 100. Second in the topic – Tony Nov 20 '14 at 11:59
  • OK, I am working on an answer. Can you remove the results from the statistics query (those are the expected defaults) and instead add the table definition (what you get with `\d session` in `psql`)? – Erwin Brandstetter Nov 20 '14 at 12:16

1 Answer


Very common values in the predicates for ssid_id and ap_id can make it cheaper for Postgres to pick the smaller index session_lim_values_idx (only 1 date column) over the seemingly better-fitting but bigger index session_date_ssid_apid_idx (4 columns), filtering the rest.

In your case around 4 % of the rows have ssid_id=42 AND ap_id=1731. That shouldn't normally warrant the switch to the smaller index. But several other factors are in play that may tilt the scale, basically cost settings and statistics.

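How common those values actually are can be measured directly; this is the count query suggested in the comments above (values taken from the question):

```sql
-- Total rows vs. rows matching the predicates. A large matching fraction
-- means the leading columns of session_date_ssid_apid_idx filter out little,
-- so the smaller single-column index can win.
SELECT count(*)                                       AS total,
       count(ssid_id = 42 AND ap_id = 1731 OR NULL)   AS matching
FROM   session;
```

The `count(expr OR NULL)` trick counts only rows where the expression is true, because `count` ignores NULLs.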
What to do?

  • Adjust your cost settings if you did not do so already, as advised in the linked answer above.

  • Increase the statistics target for involved columns ssid_id, ap_id and run ANALYZE:

    One special factor here: Postgres collects separate statistics for expressions in indexes. Check with:

    SELECT * FROM pg_statistic
    WHERE starelid = 'session_date_ssid_apid_idx'::regclass;
    

    You'll find a dedicated row for the expression date(associationtime).

  • Make the index session_date_ssid_apid_idx more attractive (smaller) by removing the 4th column date_trunc('hour'::text, associationtime). Looking at the table definition you added later, you already did that.

  • I would rather use the standard syntax for casts: cast(associationtime AS date) instead of the function syntax date(associationtime). Not that it matters here; I just know the standard form works properly. You can use the shorthand syntax associationtime::date in your queries; that's compatible with the expression index, but use the verbose form in the index definition.
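The statistics bullet above can be sketched like this (the target value 1000 is just an example; the default is 100):

```sql
-- Raise the per-column statistics targets for the columns in the predicates,
-- then collect fresh statistics so the planner sees the new detail.
ALTER TABLE session ALTER COLUMN ssid_id SET STATISTICS 1000;
ALTER TABLE session ALTER COLUMN ap_id   SET STATISTICS 1000;
ANALYZE session;
```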

Also, test with EXPLAIN ANALYZE which query plan is actually faster by removing / recreating only the index you want to test. Then you'll see whether Postgres picked the best plan after all.

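One low-risk way to run that test is inside a transaction, since DDL is transactional in Postgres. Note that DROP INDEX takes an exclusive lock on the table until rollback, so do this only briefly or on a test system:

```sql
BEGIN;
DROP INDEX session_lim_values_idx;  -- hide this index from the planner
EXPLAIN ANALYZE
SELECT ...;                         -- the query from the question
ROLLBACK;                           -- index is restored, nothing persisted
```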
You have quite a number of indexes; I would check whether all of them are actually used and get rid of the rest. Indexes have maintenance costs, and it's generally beneficial to concentrate on fewer indexes where possible (they fit in cache more easily and may already be cached when needed). Weigh cost vs. benefit.

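Which indexes are actually used can be checked in the cumulative statistics views (counts accumulate since the last statistics reset, so judge over a representative period):

```sql
-- Indexes on session that are rarely or never scanned are candidates to drop.
SELECT indexrelname, idx_scan
FROM   pg_stat_user_indexes
WHERE  relname = 'session'
ORDER  BY idx_scan;
```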
Aside

I'd use:

SUM(extract(epoch FROM disassociationtime
                     - associationtime)::int) AS sessionduration
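For context on that form: subtracting two timestamps yields an interval, and extract(epoch FROM ...) applied to an interval returns its length in seconds, so only one extract is needed. A standalone illustration:

```sql
-- An interval of one hour is 3600 seconds.
SELECT extract(epoch FROM timestamp '2014-11-04 01:00'
                        - timestamp '2014-11-04 00:00');  -- 3600
```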
Erwin Brandstetter
  • It is frustrating that a database in 2020 chooses completely wrong indexes and plans badly... We're giving MySQL a second thought after all this, but we need geospatial support. I've set random_page_cost to 1.2, yet it chooses the wrong index for a simple query, depending on how long an xyz_id in the field list is. Longer values make Postgres choose the right index, which shouldn't happen. – Kevin Parker Mar 12 '20 at 21:10
  • We had to disable sorting system-wide for it to choose the right index; otherwise it chooses a single-column index and sorts that at enormous time and cost. No matter how many times we run VACUUM FULL ANALYZE, and even with default_statistics_target=10000 (supposedly the most accurate setting), it does not work. PG11. – Kevin Parker Mar 12 '20 at 22:55