
I have around 1 billion rows and 6 columns (id, col1, col2..col5) in my Postgres database table, with id as the PRIMARY KEY. The database is read-only, and each time I only need to retrieve/read 10000 rows based on a list of 10000 ids. The query takes 40-60 seconds to retrieve those 10000 rows, which is very slow. Below is my query:

SELECT * 
FROM table_1b 
WHERE id IN (18792, 8718, 15010, 16671,.....*list of 10000 ids*....., 290224379, 192494270)

EXPLAIN ANALYZE for the query:

Index Scan using table_1b_pkey on table_1b 
(cost=0.57..46197.28 rows=10000 width=239) 
(actual time=13.305..55927.060 rows=10000 loops=1)
Index Cond: (id = ANY ('{18792, 8718, 15010, 16671,............,290224379,192494270}'::bigint[]))
Planning Time: 13.380 ms
Execution Time: 55935.614 ms

I would like to significantly speed up the retrieval of the 10000 rows. I am fairly new to Postgres. I have read other posts and tried a few things, including increasing work_mem from 4 MB to 256 MB, but it did not help. The Postgres server is running on a machine with 384 GB RAM.

Appreciate any help in advance.

Edit 1: EXPLAIN ANALYZE for 10 records

Index Scan using table_1b_pkey on table_1b
(cost=0.57..49.95 rows=10 width=239) 
(actual time=17.204..123.489 rows=10 loops=1)
Planning Time: 0.378 ms
Execution Time: 123.540 ms

Edit 2: track_io_timing = ON; EXPLAIN (ANALYZE, BUFFERS)

Index Scan using table_1b_pkey on table_1b 
(cost=0.57..17243.33 rows=10000 width=239) 
(actual time=17.304..55371.511 rows=10000 loops=1)
Buffers: shared hit=29891 read=20108
I/O Timings: read=55032.422
Planning Time: 13.113 ms
Execution Time: 55380.644 ms

Edit 3: Updated Server Configuration

max_connections = 20
shared_buffers = 25GB
effective_cache_size = 75GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 640MB
min_wal_size = 1GB
max_wal_size = 4GB
Sushant
  • Does this answer your question? [Postgres query optimization (forcing an index scan)](https://stackoverflow.com/questions/14554302/postgres-query-optimization-forcing-an-index-scan) – Charlieface Jan 17 '21 at 13:48
  • I've had success in the past by replacing the IN() condition with a JOIN against VALUES (),(),(). Could you also show us the result from EXPLAIN ANALYZE when you're looking for just 10 records instead of 10000? Your current query takes about 5 ms per record, which is imho not very fast – Frank Heikens Jan 17 '21 at 13:52
  • @Charlieface I tried SET enable_seqscan = OFF; as given in the link, but it made no difference – Sushant Jan 17 '21 at 15:22
  • @FrankHeikens I have edited my post to include EXPLAIN ANALYZE for 10 records – Sushant Jan 17 '21 at 15:22
  • I am surprised this isn't using a Parallel Index Scan. What is your Postgres version? – a_horse_with_no_name Jan 17 '21 at 15:29
  • @a_horse_with_no_name Postgres version is 13.0. Do I need to do/enable anything to get a parallel index scan? – Sushant Jan 17 '21 at 15:36
  • You are probably limited by the speed of your storage. Turn on track_io_timing, and redo the plans with `EXPLAIN (ANALYZE, BUFFERS)` to be sure. – jjanes Jan 17 '21 at 17:27
  • @jjanes I have posted your suggestion in Edit 2 of my post. I have also updated my server config and posted in Edit 3. – Sushant Jan 17 '21 at 17:45
  • @a_horse_with_no_name An IN-list will not be broken into chunks and used to drive a parallel plan. Nor will a VALUES list. Nor will a TEMP table. So he would need to populate a real table with his 10,000 ids, and probably do `ALTER TABLE...SET(parallel_workers=8)`, to get it to work. – jjanes Jan 17 '21 at 18:30
  • @Sushant your track_io_timing did not seem to be effective. It can only be set by a superuser. I would just turn it on in the config file so that it is always on. In my experience, the overhead is low enough that there is no point in trying to be selective with it. – jjanes Jan 17 '21 at 18:34
  • @jjanes: ah, thanks. I wasn't aware of that limitation. – a_horse_with_no_name Jan 17 '21 at 18:40
  • @jjanes Apologies, I have now updated Edit 2 with I/O timings - I/O Timings: read=55032.422. – Sushant Jan 17 '21 at 21:14
  • I'm surprised by the lack of speed of your storage. Getting just 10 records with a very simple query in 123 ms, that's just slow. What kind of storage does your server have? – Frank Heikens Jan 18 '21 at 09:29

1 Answer


So essentially all the time is spent reading the data from disk. Each row requires two reads, one for the index leaf page, and one for the table page. (The higher level blocks in the index are presumably mostly in cache). Some (unknown) fraction of them might be satisfied by the OS/filesystem cache.

384GB is a lot of memory. Maybe it is enough to hold the entire table and index, I don't know. (How big is the table and index?) But just having enough RAM to hold the data set is not enough. You would have to get the data into RAM in the first place (pg_prewarm could help with that) and you would have to convince the OS to keep it there rather than evict it in favor of something else (do you also have other large data sets which are frequently used?).
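
A prewarming pass could look like the sketch below (assuming the contrib module pg_prewarm is installed and the index is named table_1b_pkey, as shown in the plans above). The default 'buffer' mode loads blocks into shared_buffers; since the table is probably much larger than the 25GB of shared_buffers from Edit 3, the 'read' mode, which only populates the OS page cache, may be more realistic:

-- one-time setup of the contrib extension
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- pull the primary-key index and the table into the OS page cache
SELECT pg_prewarm('table_1b_pkey', 'read');
SELECT pg_prewarm('table_1b', 'read');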

Of course the "easy" solution is to get faster storage hardware.

If you created a covering index that included all the columns of the table (using the INCLUDE syntax, so that you could still have just one index and use it both for this query and for uniqueness enforcement), that would enable index-only scans. This would cut the amount of I/O roughly in half, as you would then only need to read the index leaf page, not the table page, for each row.
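
A sketch of what that index could look like, assuming col1..col5 are all the remaining columns (the index name here is made up):

-- unique index on id that also stores the other columns in its leaf pages
CREATE UNIQUE INDEX CONCURRENTLY table_1b_id_covering
    ON table_1b (id)
    INCLUDE (col1, col2, col3, col4, col5);

-- index-only scans only skip the heap for pages marked all-visible,
-- so make sure the visibility map is up to date
VACUUM (ANALYZE) table_1b;

On a read-only table, a single VACUUM after loading is enough to keep the visibility map current, so index-only scans can actually avoid the heap reads.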

You could get parallel query if you put the ids into another table and then write the query as a join.

SELECT table_1b.*
FROM table_1b
JOIN id_list USING (id);

Since the id_list table will be small (10,000 rows is large for an IN list, but not for a table), you will probably need to override its settings to get parallel query to actually be used: ALTER TABLE id_list SET (parallel_workers = 8). How effective this is will depend on your storage system. A single spinning HDD would likely see little improvement, while an SSD or a RAID of HDDs would see more.
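
A minimal sketch of setting that up (id_list is the name used in the query above; only a few of the ids from the question are shown, the full list of 10,000 would be bulk-loaded with INSERT or COPY):

-- a real (non-temporary) table holding the ids to look up
CREATE TABLE id_list (id bigint PRIMARY KEY);

-- load the ids (truncated here for illustration)
INSERT INTO id_list (id) VALUES (18792), (8718), (15010), (16671);
ANALYZE id_list;

-- encourage a parallel plan despite the small driving table
ALTER TABLE id_list SET (parallel_workers = 8);

Whether a parallel plan is actually chosen also depends on settings such as max_parallel_workers_per_gather.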

jjanes