Table Partitioning vs non-Partitioned Table with many indexes

Question

I have a DB "DB_One" with a Master Table called t_d_gate_out with 8 indexes on it. I created another DB with partitioned t_d_gate_out (let's call it "DB_Two"). It was partitioned by month and year (example of child table: t_d_gate_out09-2013) which has two indexes (d_gate_out and new column on each child: i_trx_own)

This is my function of creating and inserting child tables:

CREATE OR REPLACE FUNCTION ctm_test.gateout_partition_function()
  RETURNS trigger AS
$BODY$ 
DECLARE new_time text;
tablename text;
seqname text;
seqname_schema text;
bulantahun text;
bulan text;
bulan2 text;
tahun text;
enddate text;
result record;

BEGIN new_time := to_char(NEW.d_gate_out,'MM-YYYY');
bulan:=to_char(NEW.d_gate_out,'MM');
bulan2:=extract(month from NEW.d_gate_out);
tahun:=to_char(NEW.d_gate_out,'YYYY');
bulantahun := new_time;
tablename := 't_d_gate_out'||bulantahun;
seqname := 't_d_gate_out'||bulantahun||'_seq';
seqname_schema := 'ctm_test.t_d_gate_out'||bulantahun||'_seq';

PERFORM 1 FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE 

c.relkind = 'r' AND c.relname = tablename AND n.nspname = 'ctm_test';

IF NOT FOUND THEN  EXECUTE 'CREATE TABLE ctm_test.' || quote_ident(tablename) || ' ( i_trx_own 

serial PRIMARY KEY, CHECK (extract(month from d_gate_out)=' || bulan2 || ' AND extract(year from 

d_gate_out)=' || tahun || ')) INHERITS (ctm_test.t_d_gate_out)';

EXECUTE 'ALTER TABLE ctm_test.' || quote_ident(tablename) || ' OWNER TO postgres'; EXECUTE 'GRANT 

ALL ON TABLE ctm_test.' || quote_ident(tablename) || ' TO postgres';

 EXECUTE 'CREATE INDEX ' || quote_ident(tablename||'_indx1') || ' ON ctm_test.' || quote_ident

(tablename) || ' (i_trx_own);CREATE INDEX ' || quote_ident(tablename||'_indx2') || ' ON ctm_test.' || quote_ident

(tablename) || ' (d_gate_out)'; END IF;

EXECUTE 'INSERT INTO ctm_test.' || quote_ident(tablename) || ' VALUES ($1.*)' USING NEW; RETURN 

NULL; END; $BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION ctm_test.gateout_partition_function()
  OWNER TO postgres;

And this is my TRIGGER:

CREATE TRIGGER gateout_master_trigger
  BEFORE INSERT
  ON ctm_test.t_d_gate_out
  FOR EACH ROW
  EXECUTE PROCEDURE ctm_test.gateout_partition_function();

After inserting about 200k rows, I was trying to compare the speed of viewing data between those 2 DBs. The query I used to compare:

select * from ctm_test."t_d_gate_out"
where d_gate_out BETWEEN '2013-10-01' AND '2013-10-31'

The result after I tried to execute that query for several times, the execution times were ALMOST THE SAME. So the partitioned table was not effective to view data faster.
Was I doing the partition correctly?

200k isn't that much, we do about 20 million per day in a one month partition, resulting in 450 to 500 million records per table partition. — Frank Heikens, Jan 12 '15 at 06:34
Thanks sir. I don't really understand about this partition. Can I get the child's data by SELECT-ing a query on MASTER TABLE. I mean like this, when I execute : Select * from master_table where date BETWEEN '2013-10-01' AND '2013-11-31'. Are the data picked from master table, or from the child tables? I'm still confused about the work of this partitioning. I just know that this is one way to make the database become lighter... :( And if I want a partition by month and year, Did I create my "CHECK" correctly?? thanks Sir:) – — Septa Ringga Daniarta, Jan 12 '15 at 08:39
Since you are partitioning with inheritance (`INHERITS (ctm_test.t_d_gate_out)`, data from all child tables is included in a `SELECT` query on the master table. [Read the manual](http://www.postgresql.org/docs/current/interactive/ddl-inherit.html). Use `SELECT ... FROM ONLY tbl` to exclude children: http://stackoverflow.com/questions/27881745/errorkey-is-not-present-in-table/27881996#27881996. **CHECK** constraint: While correct, it is **very inefficient**. Please present your code in a **readable format** first. — Erwin Brandstetter, Jan 12 '15 at 10:20
my "Check" will work like this : when I'm inserting a row with d_gate_out = '2013-10-01' then the check will be : CHECK (extract(month from d_gate_out)='10' AND extract(year from d_gate_out)='2013') Did I write my Syntax correctly??? :) — Septa Ringga Daniarta, Jan 14 '15 at 02:00

score 3 · Answer 1 · edited Jan 23 '16 at 08:35

3

200K rows isn't very many.

Partitioning mainly helps when your table is larger than RAM. and most of your queries can be served from a single table.

edited Jan 23 '16 at 08:35

marc_s

732,580
175
1,330
1,459

answered Jan 12 '15 at 05:06

Jasen

11,837
2
30
48

score 2 · Answer 2 · edited May 23 '17 at 12:34

As long as each of your queries can use an index, they will perform almost the same. (Actually, you should see a sequential scan on the partitioned table, since you read all rows from it.)
Access via index is fast (I'd expect bitmap index scans). The index on the monolithic table is (much) bigger (and also needs an additional leading column), which may challenge your RAM size and will cost more for the first run. But as soon as the index resides in RAM and you have enough of it, you won't notice much on further calls. There are other situations where you might profit more from partitioning.

Or worse: queries that involve multiple partitions tend to be slower than equivalent queries on a single big table. It's cheaper to access a single table. Typically not the case for a query like you demonstrate: on a small time range, only spanning one (or few) partitions. The worst case would be queries that read rows from the whole range, randomly spread out.

If your table is huge and queries are mostly on a few partitions, you may start to see a benefit. The index for a small partition is much smaller and much more readily fits into RAM (and stays there). Plenty of RAM is the key ingredient for performance (in addition to matching indexes).

The number of indexes is almost completely irrelevant for read performance. The general guideline is to create as few indexes as possible, but as many as needed. If in doubt, favor simple indexes that can serve more kinds of queries over highly specialized indexes, tailored for a single use case (unless that case is particularly important). Any unused index is just a burden on write performance and wastes space.

If you are mostly concerned with read performance, partial indexes on a single big table might be an alternative.

Aside: your trigger function could be improved in multiple places. Start with a more readable format and this hint:

PL/pgSQL checking if a row exists - SELECT INTO boolean

Table Partitioning vs non-Partitioned Table with many indexes

2 Answers2