I need help deleting a large number of specific rows by ID (around 10k IDs) from a big table that is partitioned by year.
I ran the following query to check the execution plan with only 2 IDs:
explain analyze delete from scheme.users_daily
where user_id in ( -- user_id is not PK, can be duplicated
    5791001,
    7779001
)
In reality there are around 10k IDs that I want to delete; I guess that is around 1-3% of all the data in the table.
And received the following result:
"Delete on users_daily (cost=0.00..8573700.00 rows=3627 width=6) (actual time=10271.578..10271.578 rows=0 loops=1)"
" -> Seq Scan on users_daily (cost=0.00..0.00 rows=1 width=6) (actual time=0.001..0.001 rows=0 loops=1)"
" Filter: (user_id= ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2013_unq on users_daily_y2013 (cost=0.57..2351216.55 rows=1556 width=6) (actual time=2792.469..2792.469 rows=0 loops=1)"
" Index Cond: (user_id= ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2014_unq on users_daily_y2014 (cost=0.56..1745517.74 rows=694 width=6) (actual time=2158.849..2158.849 rows=0 loops=1)"
" Index Cond: (user_id = ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2015hy1_unq on users_daily_y2015hy1 (cost=0.56..1039034.28 rows=349 width=6) (actual time=1224.623..1224.623 rows=0 loops=1)"
" Index Cond: (user_id= ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2015hy2_unq on users_daily_y2015hy2 (cost=0.56..1159715.43 rows=375 width=6) (actual time=1380.513..1380.513 rows=0 loops=1)"
" Index Cond: (user_id= ANY ('{5791001,7779001}'::bigint[]))"
" -> Index Scan using users_daily_y2016_unq on users_daily_y2016 (cost=0.57..2278216.00 rows=652 width=6) (actual time=2715.106..2715.106 rows=0 loops=1)"
" Index Cond: (user_id= ANY ('{5791001,7779001}'::bigint[]))"
"Planning time: 0.364 ms"
Then I checked how many rows are in the table and its partitions:
explain analyze select count(id)
from scheme.users_daily
Response:
"Aggregate (cost=9735664.99..9735665.00 rows=1 width=8) (actual time=1005691.857..1005691.858 rows=1 loops=1)"
" -> Append (cost=0.00..9032279.19 rows=281354320 width=8) (actual time=1.967..797587.379 rows=268084881 loops=1)"
" -> Seq Scan on users_daily (cost=0.00..0.00 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)"
" -> Seq Scan on users_daily_y2013 (cost=0.00..2489084.32 rows=77247432 width=8) (actual time=1.957..134821.451 rows=71962946 loops=1)"
" -> Seq Scan on users_daily_y2014 (cost=0.00..1848731.32 rows=57374432 width=8) (actual time=1.256..72915.919 rows=54835860 loops=1)"
" -> Seq Scan on users_daily_y2015hy1 (cost=0.00..1094813.48 rows=34156048 width=8) (actual time=0.735..40167.938 rows=32796038 loops=1)"
" -> Seq Scan on users_daily_y2015hy2 (cost=0.00..1219911.76 rows=38124076 width=8) (actual time=0.801..43994.913 rows=36669043 loops=1)"
" -> Seq Scan on users_daily_y2016 (cost=0.00..2379738.31 rows=74452331 width=8) (actual time=0.009..109012.004 rows=71820994 loops=1)"
"Planning time: 0.688 ms"
"Execution time: 1005691.977 ms"
How can I solve this performance issue when deleting a lot of rows from this big table?
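For example, would it help to run the delete against each child table separately (assuming all rows live in the yearly partitions and not in the parent), like this:

delete from scheme.users_daily_y2013 u
using ids_to_delete d
where u.user_id = d.user_id;
-- then the same statement for users_daily_y2014, users_daily_y2015hy1,
-- users_daily_y2015hy2 and users_daily_y2016

Or is there a better approach altogether?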