
I have the following scenario in PostgreSQL 11:

DROP TABLE IF EXISTS public.agent_sessions_partitioned;

CREATE TABLE public.agent_sessions_partitioned
            (
                id uuid NOT NULL DEFAULT uuid_generate_v4(),
                account_id uuid,
                user_id uuid
            ) PARTITION BY LIST (account_id);

CREATE TABLE "agent_sessions_bcbc5acc-f020-4073-bdf4-3098bc043e8b"
                  PARTITION OF agent_sessions_partitioned
                  FOR VALUES IN ('bcbc5acc-f020-4073-bdf4-3098bc043e8b'); 

INSERT INTO agent_sessions_partitioned (id, account_id, user_id)
    SELECT agent_sessions.id, account_id, user_id FROM agent_sessions; 

ALTER TABLE "agent_sessions_bcbc5acc-f020-4073-bdf4-3098bc043e8b" ADD PRIMARY KEY (id); 

And so on.

This works very well when I have any query like this:

SELECT * FROM agent_sessions_partitioned WHERE account_id = 'XX';

But because I'm using an ORM (Rails / ActiveRecord), I don't have the option of always including account_id in the statement, and whenever I need to do something like:

UPDATE agent_sessions_partitioned SET user_id = 'x' WHERE id = 'y';

PG scans all child tables trying to find this tuple; see the EXPLAIN ANALYZE output below:

"Append  (cost=0.28..2612.12 rows=355 width=558) (actual time=0.956..277.658 rows=1 
loops=1)"
"  ->  Index Scan using "agent_sessions_a13f3c88-3022-4676-bd48-6580d8877ae2_pkey" on 
"agent_sessions_a13f3c88-3022-4676-bd48-6580d8877ae2"  (cost=0.28..8.30 rows=1 width=500)             
(actual time=0.955..0.956 rows=1 loops=1)"
"        Index Cond: (id = 'b21a0178-f97c-4598-ba39-bf763ba377b5'::uuid)"
"  ->  Index Scan using "agent_sessions_325774d6-e5e7-4fae-9659-8b76349a6c2a_pkey" on 
"agent_sessions_325774d6-e5e7-4fae-9659-8b76349a6c2a"  (cost=0.29..8.30 rows=1 width=481) 
(actual time=0.750..0.750 rows=0 loops=1)"
"        Index Cond: (id = 'b21a0178-f97c-4598-ba39-bf763ba377b5'::uuid)"
"  ->  Index Scan using "agent_sessions_1f781bcd-b941-4915-949a-9af893d8f066_pkey" on 
"agent_sessions_1f781bcd-b941-4915-949a-9af893d8f066"  (cost=0.29..8.30 rows=1 width=507) 
 (actual time=1.523..1.523 rows=0 loops=1)"

As I don't have the option to change this query that updates records by id, is there anything I can do on the Postgres side? Any configuration, another type of partitioning, or even an upgrade to PostgreSQL 12/13 that might help?

Moondustt
    May I ask why you're using partitioning? Is your table extremely large? You may be better off with a regular table that has an index on "id" and an index on "account_id". – Blue Star Sep 29 '20 at 04:52
  • You should always access data according to the partitioning rule, or Postgres cannot determine where the data is located. – spike 王建 Sep 29 '20 at 05:39
  • Yes, my table is getting really large, and partitioning was one of the strategies I found. For instance, I have a table with 180 million records. – Moondustt Sep 29 '20 at 14:05

1 Answer


No, there is no way to avoid this increased planning and execution cost unless you partition the table differently.
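For example, if single-row lookups by id dominate your workload, you could partition on id instead of account_id, so that WHERE id = ... prunes to a single partition. A minimal sketch (table names are hypothetical; this trades away pruning for your account_id queries, so it only helps if those are rarer):

    -- Sketch: hash-partition on id so lookups by id prune to one partition.
    -- In PostgreSQL 11+ a primary key on a partitioned table must include
    -- the partition key, which works here because id IS the partition key.
    CREATE TABLE agent_sessions_by_id
    (
        id uuid NOT NULL DEFAULT uuid_generate_v4(),
        account_id uuid,
        user_id uuid,
        PRIMARY KEY (id)
    ) PARTITION BY HASH (id);

    CREATE TABLE agent_sessions_by_id_0 PARTITION OF agent_sessions_by_id
        FOR VALUES WITH (MODULUS 4, REMAINDER 0);
    -- ...and likewise for remainders 1 through 3.

    -- This now touches a single partition instead of appending all of them:
    -- UPDATE agent_sessions_by_id SET user_id = '...' WHERE id = '...';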

Contrary to common belief, partitioning a table slows down most SQL statements that use it. Only in rare cases, when you can restrict a sequential scan to a subset of the partitions, will you see a performance increase. Note that an index scan on a big table is not noticeably slower than an index scan on a small table.

Rather, you use partitioning to gain speed and flexibility in adding and removing many table rows in bulk operations (in PostgreSQL, it also helps with autovacuum on large tables). The performance impact on queries is a price you pay for that.

Unless you have too many partitions (beware!), the impact shouldn't be too bad. You can use prepared statements to reduce the planning time for queries like the one in your question.
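A sketch of the prepared-statement approach (statement name and values are placeholders; ActiveRecord already uses prepared statements for its PostgreSQL adapter by default, so you may be getting this for free):

    -- After several executions PostgreSQL can switch to a cached generic
    -- plan, which skips the per-call planning over all partitions.
    PREPARE upd_session(uuid, uuid) AS
        UPDATE agent_sessions_partitioned SET user_id = $1 WHERE id = $2;

    EXECUTE upd_session('00000000-0000-0000-0000-000000000001',
                        '00000000-0000-0000-0000-000000000002');

    -- From PostgreSQL 12 on you can force the generic plan explicitly:
    -- SET plan_cache_mode = force_generic_plan;

Execution still visits every partition's index (there is no partition key in the WHERE clause), but the repeated planning cost is avoided.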

Laurenz Albe