
I have a PostgreSQL 10.6 database on Amazon RDS. My table is like this:

CREATE TABLE dfo_by_quarter (
    release_key int4 NOT NULL,
    country varchar(100) NOT NULL,
    product_group varchar(100) NOT NULL,
    distribution_type varchar(100) NOT NULL,
    "year" int2 NOT NULL,
    "date" date NULL,
    quarter int2 NOT NULL,
    category varchar(100) NOT NULL,
    units numeric(38,6) NOT NULL,
    sales_value_eur numeric(38,6) NOT NULL,
    sales_value_usd numeric(38,6) NOT NULL,
    sales_value_local numeric(38,6) NOT NULL,
    data_status bpchar(1) NOT NULL,
    panel_market_units numeric(38,6) NOT NULL,
    panel_market_sales_value_eur numeric(38,6) NOT NULL,
    panel_market_sales_value_usd numeric(38,6) NOT NULL,
    panel_market_sales_value_local numeric(38,6) NOT NULL,
    CONSTRAINT pk_dpretailer_dfo_by_quarter PRIMARY KEY (release_key, country, category, product_group, distribution_type, year, quarter),
    CONSTRAINT fk_dpretailer_dfo_by_quarter_release FOREIGN KEY (release_key) REFERENCES dpretailer.dfo_release(release_id)
);

I understand that a primary key implies a unique index.

If I simply ask how many rows I have when filtering on non-existent data (release_key = 1 returns nothing), I can see it uses the index:

EXPLAIN
SELECT COUNT(*)
  FROM dpretailer.dfo_by_quarter
  WHERE release_key = 1

Aggregate  (cost=6.32..6.33 rows=1 width=8)
  ->  Index Only Scan using pk_dpretailer_dfo_by_quarter on dfo_by_quarter  (cost=0.55..6.32 rows=1 width=0)
        Index Cond: (release_key = 1)

But if I run the same query on a value that returns data, it scans the table, which is bound to be more expensive...

EXPLAIN
SELECT COUNT(*)
  FROM dpretailer.dfo_by_quarter
  WHERE release_key = 2

Finalize Aggregate  (cost=47611.07..47611.08 rows=1 width=8)
  ->  Gather  (cost=47610.86..47611.07 rows=2 width=8)
        Workers Planned: 2
        ->  Partial Aggregate  (cost=46610.86..46610.87 rows=1 width=8)
              ->  Parallel Seq Scan on dfo_by_quarter  (cost=0.00..46307.29 rows=121428 width=0)
                    Filter: (release_key = 2)

I get that using the index when there is no data makes sense and is driven by the stats on the table (I ran ANALYZE before the tests).
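For reference, the stats the planner relies on can be inspected directly with something like this (just a sketch, not one of my original tests):

SELECT attname, n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE schemaname = 'dpretailer'
   AND tablename  = 'dfo_by_quarter'
   AND attname    = 'release_key';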

But why not use my index if there is data?

Surely, it must be quicker to scan part of an index (because release_key is the first column) than to scan the entire table?

I must be missing something...?

Update 2019-03-07

Thank you for your comments, which are very useful.

This simple query was just me trying to understand why the index was not used...

But I should have known better (I am new to PostgreSQL but have many years' experience with SQL Server), and it makes sense that the index is not used, for the reasons you pointed out:

  • bad selectivity, because my criterion still matches about 20% of the rows
  • bad table design (too "fat", which we knew and are now addressing)
  • index not "covering" the query, etc.

So let me change my question slightly, if I may...

Our table will be normalised into facts/dimensions (no more varchars in the wrong place).

We only do inserts, never updates, and so few deletes that we can ignore them.

The table size will not be huge (on the order of tens of millions of rows).

Our queries will ALWAYS specify an exact release_key value.

Our new version of the table would look like this:

CREATE TABLE dfo_by_quarter (
    release_key int4 NOT NULL,
    country_key int2 NOT NULL,
    product_group_key int2 NOT NULL,
    distribution_type_key int2 NOT NULL,
    category_key int2 NOT NULL,
    "year" int2 NOT NULL,
    "date" date NULL,
    quarter int2 NOT NULL,
    units numeric(38,6) NOT NULL,
    sales_value_eur numeric(38,6) NOT NULL,
    sales_value_usd numeric(38,6) NOT NULL,
    sales_value_local numeric(38,6) NOT NULL,
    CONSTRAINT pk_milly_dfo_by_quarter PRIMARY KEY (release_key, country_key, category_key, product_group_key, distribution_type_key, year, quarter),
    CONSTRAINT fk_milly_dfo_by_quarter_release FOREIGN KEY (release_key) REFERENCES dpretailer.dfo_release(release_id),
    CONSTRAINT fk_milly_dim_dfo_category FOREIGN KEY (category_key) REFERENCES milly.dim_dfo_category(category_key),
    CONSTRAINT fk_milly_dim_dfo_country FOREIGN KEY (country_key) REFERENCES milly.dim_dfo_country(country_key),
    CONSTRAINT fk_milly_dim_dfo_distribution_type FOREIGN KEY (distribution_type_key) REFERENCES milly.dim_dfo_distribution_type(distribution_type_key),
    CONSTRAINT fk_milly_dim_dfo_product_group FOREIGN KEY (product_group_key) REFERENCES milly.dim_dfo_product_group(product_group_key)
);

With that in mind, in a SQL Server environment I could solve this by having a clustered primary key (the entire table physically sorted by the key), or by having an index on the primary key with the INCLUDE option for the other columns required to cover the queries (units, values, etc.).

Question 1)

In PostgreSQL, is there an equivalent to the SQL Server clustered index? A way to actually keep the entire table sorted? I suppose it might be difficult because PostgreSQL does not do updates "in place", which might make maintaining the sort order expensive...

Or is there a way to create something like a SQL Server index WITH INCLUDE (units, values)?

Update: I came across the CLUSTER command, which I suppose is the closest thing. It would be suitable for us.

Question 2

With the query below:

EXPLAIN (ANALYZE, BUFFERS)
WITH "rank_query" AS
(
  SELECT
    ROW_NUMBER() OVER(PARTITION BY "year" ORDER BY SUM("main"."units") DESC) AS "rank_by",
    "year",
    "main"."product_group_key" AS "productgroupkey",
    SUM("main"."units") AS "salesunits",
    SUM("main"."sales_value_eur") AS "salesvalue",
    SUM("sales_value_eur")/SUM("units") AS "asp"
  FROM "milly"."dfo_by_quarter" AS "main"

  WHERE
    "release_key" = 17 AND
    "main"."year" >= 2010
  GROUP BY
    "year",
    "main"."product_group_key"
)
, BeforeLookup AS
(
SELECT
  "year" AS date,
  SUM("salesunits") AS "salesunits",
  SUM("salesvalue") AS "salesvalue",
  SUM("salesvalue")/SUM("salesunits") AS "asp",
  CASE WHEN "rank_by" <= 50 THEN "productgroupkey" ELSE -1 END AS "productgroupkey"
FROM
  "rank_query"
GROUP BY
  "year",
  CASE WHEN "rank_by" <= 50 THEN "productgroupkey" ELSE -1 END
)
SELECT BL.date, BL.salesunits, BL.salesvalue, BL.asp
  FROM BeforeLookup AS BL
  INNER JOIN milly.dim_dfo_product_group PG ON PG.product_group_key = BL.productgroupkey;

I get this:

Hash Join  (cost=40883.82..40896.46 rows=558 width=98) (actual time=676.565..678.308 rows=663 loops=1)
  Hash Cond: (bl.productgroupkey = pg.product_group_key)
  Buffers: shared hit=483 read=22719
  CTE rank_query
    ->  WindowAgg  (cost=40507.15..40632.63 rows=5577 width=108) (actual time=660.076..668.272 rows=5418 loops=1)
          Buffers: shared hit=480 read=22719
          ->  Sort  (cost=40507.15..40521.09 rows=5577 width=68) (actual time=660.062..661.226 rows=5418 loops=1)
                Sort Key: main.year, (sum(main.units)) DESC
                Sort Method: quicksort  Memory: 616kB
                Buffers: shared hit=480 read=22719
                ->  Finalize HashAggregate  (cost=40076.46..40160.11 rows=5577 width=68) (actual time=648.762..653.227 rows=5418 loops=1)
                      Group Key: main.year, main.product_group_key
                      Buffers: shared hit=480 read=22719
                      ->  Gather  (cost=38710.09..39909.15 rows=11154 width=68) (actual time=597.878..622.379 rows=11938 loops=1)
                            Workers Planned: 2
                            Workers Launched: 2
                            Buffers: shared hit=480 read=22719
                            ->  Partial HashAggregate  (cost=37710.09..37793.75 rows=5577 width=68) (actual time=594.044..600.494 rows=3979 loops=3)
                                  Group Key: main.year, main.product_group_key
                                  Buffers: shared hit=480 read=22719
                                  ->  Parallel Seq Scan on dfo_by_quarter main  (cost=0.00..36019.74 rows=169035 width=22) (actual time=106.916..357.071 rows=137171 loops=3)
                                        Filter: ((year >= 2010) AND (release_key = 17))
                                        Rows Removed by Filter: 546602
                                        Buffers: shared hit=480 read=22719
  CTE beforelookup
    ->  HashAggregate  (cost=223.08..238.43 rows=558 width=102) (actual time=676.293..677.167 rows=663 loops=1)
          Group Key: rank_query.year, CASE WHEN (rank_query.rank_by <= 50) THEN (rank_query.productgroupkey)::integer ELSE '-1'::integer END
          Buffers: shared hit=480 read=22719
          ->  CTE Scan on rank_query  (cost=0.00..139.43 rows=5577 width=70) (actual time=660.079..672.978 rows=5418 loops=1)
                Buffers: shared hit=480 read=22719
  ->  CTE Scan on beforelookup bl  (cost=0.00..11.16 rows=558 width=102) (actual time=676.296..677.665 rows=663 loops=1)
        Buffers: shared hit=480 read=22719
  ->  Hash  (cost=7.34..7.34 rows=434 width=4) (actual time=0.253..0.253 rows=435 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 24kB
        Buffers: shared hit=3
        ->  Seq Scan on dim_dfo_product_group pg  (cost=0.00..7.34 rows=434 width=4) (actual time=0.017..0.121 rows=435 loops=1)
              Buffers: shared hit=3
Planning time: 0.319 ms
Execution time: 678.714 ms

Does anything spring to mind?

If I read it properly, it means my biggest cost by far is the initial scan of the table... but I can't manage to make it use an index...

I had created an index I hoped would help but it got ignored...

CREATE INDEX eric_silly_index ON milly.dfo_by_quarter(release_key, YEAR, date, product_group_key, units, sales_value_eur);

ANALYZE milly.dfo_by_quarter;

I also tried to cluster the table, but with no visible effect either:

CLUSTER milly.dfo_by_quarter USING pk_milly_dfo_by_quarter; -- took 30 seconds (uidev)

ANALYZE milly.dfo_by_quarter;

Many thanks

Eric

Eric Mamet
  • How many rows in your table? How many with `release_key = 2`. Is there (concurrent) write load on the table? Do you see an index-only scan after `VACUUM dpretailer.dfo_by_quarter`? – Erwin Brandstetter Mar 06 '19 at 18:44
  • Also, can you replace the `EXPLAIN`s with output from `EXPLAIN (ANALYZE, BUFFERS)`? That would give us concrete timings and shared cache hits/misses. – Ancoron Mar 06 '19 at 18:48

3 Answers


Generally, while possible, a PK spanning 7 columns, several of them varchar(100), is not optimized for performance, to say the least.

Such an index is large to begin with and tends to bloat quickly if you have updates on involved columns.

I would operate with a surrogate PK: a serial (or bigserial if you have that many rows), or an IDENTITY column.

And a UNIQUE constraint on all 7 columns to enforce uniqueness (all are NOT NULL anyway).

If you have lots of counting queries with the only predicate on release_key, consider an additional plain btree index on just that column.
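Putting that together, a sketch of what I have in mind (names invented, column list shortened for brevity; adapt as needed):

CREATE TABLE dfo_by_quarter (
    dfo_by_quarter_id bigserial PRIMARY KEY,        -- surrogate PK
    release_key       int4          NOT NULL,
    country           varchar(100)  NOT NULL,
    product_group     varchar(100)  NOT NULL,
    distribution_type varchar(100)  NOT NULL,
    category          varchar(100)  NOT NULL,
    "year"            int2          NOT NULL,
    quarter           int2          NOT NULL,
    units             numeric(38,6) NOT NULL,
    -- ... remaining measure columns unchanged ...
    CONSTRAINT dfo_by_quarter_uni
        UNIQUE (release_key, country, category, product_group, distribution_type, "year", quarter)
);

-- plain btree index for queries whose only predicate is on release_key
CREATE INDEX dfo_by_quarter_release_key_idx ON dfo_by_quarter (release_key);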

The data type varchar(100) for so many columns may not be optimal. Some normalization might help.

More advice depends on missing information ...

Erwin Brandstetter
  • Voting +1 for this answer since adding an index on the filter column increases the chance of getting an index-only scan. Once that is in place (remember that creating a foreign key does not create an index), the primary key should be irrelevant. – Ancoron Mar 06 '19 at 19:08
  • * I ran this particular query purely as an exercise to try and get an index scan. The original query was far more complex. * In "real scenarios", I need the other columns. * There was no other operation concurrently * There is probably about 300,000 rows for this release_key value and a few millions overall * At this stage, my table is "fat" (varchar columns) because it was a quick and dirty implementation but I'll make it look more like fact/dimensions very soon (therefore surrogate keys instead of varchars) – Eric Mamet Mar 06 '19 at 22:23
  • @EricMamet: Then the question might be misleading. All details matter. Postgres decides the query plan based on estimated costs. "A few millions" is too fuzzy. Could be 2 or 9 millions, which makes a big difference. – Erwin Brandstetter Mar 06 '19 at 22:28
  • @ErwinBrandstetter I understand that and I'll come back with more details tomorrow. In SQL Server, a request like this would always use the index because it would not need to look at the table at all and would only sequentially scan a subset of the index (regardless of the number of rows). I was wondering whether postgresql would behave the same. Obviously not! – Eric Mamet Mar 06 '19 at 22:43
  • @EricMamet: It depends. For [index-only scans](https://www.postgresql.org/docs/current/static/indexes-index-only-scans.html) like we see in your first query plan, some preconditions have to be met. – Erwin Brandstetter Mar 07 '19 at 00:13

Because release_key isn't actually a unique column, it's not possible from the information you've provided to know whether or not the index should be used. If a high percentage of rows have release_key = 2, or even if a smaller percentage matches on a large table, it may not be efficient to use the index.
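A quick way to check how selective that predicate really is (just a sketch, not something from your post):

SELECT count(*)                                 AS total_rows,
       count(*) FILTER (WHERE release_key = 2)  AS matching_rows,
       round(100.0 * count(*) FILTER (WHERE release_key = 2) / count(*), 1) AS matching_pct
  FROM dpretailer.dfo_by_quarter;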

In part this is because Postgres indexes are indirect -- that is, the index actually contains a pointer to the heap location on disk where the real tuple lives. So looping through an index requires reading an entry from the index, then reading the tuple from the heap, and repeating. For a large number of tuples it's often cheaper to scan the heap directly and avoid the indirect disk access penalty.

Edit: You generally don't want to be using CLUSTER in PostgreSQL; it's not how indexes are maintained, and it's rare to see that in the wild for that reason.

Your updated query with no data gives this plan:

                                                                                  QUERY PLAN                                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on beforelookup bl  (cost=8.33..8.35 rows=1 width=98) (actual time=0.143..0.143 rows=0 loops=1)
   Buffers: shared hit=4
   CTE rank_query
     ->  WindowAgg  (cost=8.24..8.26 rows=1 width=108) (actual time=0.126..0.126 rows=0 loops=1)
           Buffers: shared hit=4
           ->  Sort  (cost=8.24..8.24 rows=1 width=68) (actual time=0.060..0.061 rows=0 loops=1)
                 Sort Key: main.year, (sum(main.units)) DESC
                 Sort Method: quicksort  Memory: 25kB
                 Buffers: shared hit=4
                 ->  GroupAggregate  (cost=8.19..8.23 rows=1 width=68) (actual time=0.011..0.011 rows=0 loops=1)
                       Group Key: main.year, main.product_group_key
                       Buffers: shared hit=1
                       ->  Sort  (cost=8.19..8.19 rows=1 width=64) (actual time=0.011..0.011 rows=0 loops=1)
                             Sort Key: main.year, main.product_group_key
                             Sort Method: quicksort  Memory: 25kB
                             Buffers: shared hit=1
                             ->  Index Scan using pk_milly_dfo_by_quarter on dfo_by_quarter main  (cost=0.15..8.18 rows=1 width=64) (actual time=0.003..0.003 rows=0 loops=1)
                                   Index Cond: ((release_key = 17) AND (year >= 2010))
                                   Buffers: shared hit=1
   CTE beforelookup
     ->  HashAggregate  (cost=0.04..0.07 rows=1 width=102) (actual time=0.128..0.128 rows=0 loops=1)
           Group Key: rank_query.year, CASE WHEN (rank_query.rank_by <= 50) THEN (rank_query.productgroupkey)::integer ELSE '-1'::integer END
           Buffers: shared hit=4
           ->  CTE Scan on rank_query  (cost=0.00..0.03 rows=1 width=70) (actual time=0.127..0.127 rows=0 loops=1)
                 Buffers: shared hit=4
 Planning Time: 0.723 ms
 Execution Time: 0.485 ms
(27 rows)

So PostgreSQL is entirely capable of using the index for your query, but the planner is deciding that it's not worth it (i.e., the estimated cost of using the index directly is higher than the estimated cost of the parallel sequential scan).

If you set enable_indexscan = off; with no data, you get a bitmap index scan (as I'd expect). If you set enable_bitmapscan = off; with no data, you get a (non-parallel) sequential scan.

You should see the plan change back (with large amounts of data) if you set max_parallel_workers = 0;.
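For reference, those toggles are ordinary session settings, so the experiment looks roughly like this (a sketch for testing only, never something to set permanently):

SET enable_indexscan  = off;   -- planner falls back to a bitmap index scan
SET enable_bitmapscan = off;   -- and then to a plain (non-parallel) sequential scan
SET max_parallel_workers = 0;  -- disables the parallel workers for the seq scan

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
  FROM dpretailer.dfo_by_quarter
 WHERE release_key = 2;

RESET enable_indexscan;
RESET enable_bitmapscan;
RESET max_parallel_workers;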

But looking at your query's explain results, I'd very much expect using the index to be more expensive and take longer than using the parallel sequential scan. In your updated query you're still scanning a very high percentage of the table and a large number of rows, and you're also forcing heap access by referencing fields not in the index. Postgres 11 (I believe) adds covering indexes, which would theoretically allow this query to be driven by the index alone, but I'm not at all convinced that in this example it would actually be worth it.
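For illustration, a covering index in Postgres 11 would look something like this (a sketch; the index name is invented, and on 10.6 the INCLUDE columns would instead have to be plain trailing key columns, much like your eric_silly_index):

CREATE INDEX dfo_by_quarter_release_year_covering
    ON milly.dfo_by_quarter (release_key, "year")
    INCLUDE (product_group_key, units, sales_value_eur);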

jcoleman
  • I don't have access to the database right now so I won't be able to supply more details this evening. Coming from a SQL Server background, I understand the "indirect" aspect of the index and the same would apply to SQL Server if I referred to any column not covered by the index. But in my example, I made a point to do just a COUNT(*) that, in SQL Server, would be resolved without accessing the underlying table since the row count could be worked out from the index. – Eric Mamet Mar 06 '19 at 22:31
  • @EricMamet I updated my answer for your updated question. – jcoleman Mar 08 '19 at 13:03
  • Much appreciated. Makes sense – Eric Mamet Mar 14 '19 at 17:34

The answer to my initial question (why is PostgreSQL not using my index for something like SELECT COUNT(*)?) can be found in the documentation:

Introduction to VACUUM, ANALYZE, EXPLAIN, and COUNT

In particular: "This means that every time a row is read from an index, the engine has to also read the actual row in the table to ensure that the row hasn't been deleted."

This goes a long way towards explaining why I don't manage to get PostgreSQL to use my indexes when, from a SQL Server perspective, it obviously "should".
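The usual mitigation, as Erwin's comment above also hints, is to keep the visibility map current so that an index-only scan can skip the heap; a sketch of the check, using the original table:

VACUUM dpretailer.dfo_by_quarter;   -- refreshes the visibility map

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
  FROM dpretailer.dfo_by_quarter
 WHERE release_key = 2;             -- an Index Only Scan becomes much more likely
                                    -- once most heap pages are marked all-visible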

Eric Mamet