I have a large table with events, on which I have a couple of queries. These queries are way too slow for what they do. I have simplified one of the queries to this:
select distinct status_id FROM events e;
I know there are only 4 distinct status_id's so I tried to speed up the query by adding an index to the events table:
CREATE INDEX events_status_id_idx ON events USING btree (status_id);
However, it still takes well over 4 minutes to produce the 4 status_id's. The events table has quite a few partitions (around 150) and contains a total of around 500 million records. It took about 10 minutes to create the index.
Here is the query plan:
explain analyse select distinct status_id FROM events e;
HashAggregate (cost=14882839.34..14882841.34 rows=200 width=2) (actual time=368866.787..368866.832 rows=3 loops=1)
Group Key: e.status_id
-> Append (cost=0.00..13725451.72 rows=462955048 width=2) (actual time=0.137..255388.675 rows=462939088 loops=1)
-> Seq Scan on events_p1979 e (cost=0.00..237273.29 rows=9688029 width=2) (actual time=0.134..4687.666 rows=9688029 loops=1)
-> Seq Scan on events_p1980_1999 e_1 (cost=0.00..1992434.28 rows=81141128 width=2) (actual time=0.971..35872.877 rows=81141131 loops=1)
-> Seq Scan on events_p2000_2004 e_2 (cost=0.00..999619.68 rows=40738268 width=2) (actual time=0.596..17812.927 rows=40738267 loops=1)
-> Seq Scan on events_p2005_2009 e_3 (cost=0.00..1891528.88 rows=76520488 width=2) (actual time=1.003..33652.297 rows=76520489 loops=1)
-> Seq Scan on events_p2010_2014 e_4 (cost=0.00..3049149.88 rows=122807288 width=2) (actual time=0.953..53717.748 rows=122807285 loops=1)
-> Seq Scan on events_p2015 e_5 (cost=0.00..384993.18 rows=15630618 width=2) (actual time=15.734..6718.191 rows=15630618 loops=1)
-> Seq Scan on events_p2016 e_6 (cost=0.00..297953.57 rows=12142957 width=2) (actual time=0.823..5402.857 rows=12142957 loops=1)
-> Seq Scan on events_p2017 e_7 (cost=0.00..271325.19 rows=11080919 width=2) (actual time=0.620..4827.894 rows=11080919 loops=1)
-> Seq Scan on events_p2018 e_8 (cost=0.00..266106.62 rows=10866562 width=2) (actual time=0.862..4845.100 rows=10866562 loops=1)
-> Seq Scan on events_p2019 e_9 (cost=0.00..265646.49 rows=10850549 width=2) (actual time=0.786..4553.450 rows=10850549 loops=1)
-> Seq Scan on events_p202001 e_10 (cost=0.00..24421.23 rows=997123 width=2) (actual time=10.791..424.903 rows=997123 loops=1)
-> Seq Scan on events_p202002 e_11 (cost=0.00..22728.43 rows=927743 width=2) (actual time=0.807..428.602 rows=927743 loops=1)
-> Seq Scan on events_p202003 e_12 (cost=0.00..24486.46 rows=998746 width=2) (actual time=0.838..475.657 rows=998746 loops=1)
-> Seq Scan on events_p202004 e_13 (cost=0.00..23753.21 rows=968721 width=2) (actual time=0.739..430.996 rows=968721 loops=1)
-> Seq Scan on events_p202005 e_14 (cost=0.00..24373.64 rows=994264 width=2) (actual time=3.439..405.414 rows=994264 loops=1)
-> Seq Scan on events_p202006 e_15 (cost=0.00..22936.71 rows=936071 width=2) (actual time=5.127..371.514 rows=936071 loops=1)
-> Seq Scan on events_p202007 e_16 (cost=0.00..22587.46 rows=921846 width=2) (actual time=19.394..459.800 rows=921846 loops=1)
-> Seq Scan on events_p202008 e_17 (cost=0.00..24281.15 rows=991715 width=2) (actual time=0.610..449.880 rows=991715 loops=1)
-> Seq Scan on events_p202009 e_18 (cost=0.00..23276.41 rows=950741 width=2) (actual time=0.550..412.285 rows=950741 loops=1)
-> Seq Scan on events_p202010 e_19 (cost=0.00..23410.06 rows=956306 width=2) (actual time=0.632..397.671 rows=956306 loops=1)
-> Seq Scan on events_p202011 e_20 (cost=0.00..22854.35 rows=933535 width=2) (actual time=0.650..414.511 rows=933535 loops=1)
-> Seq Scan on events_p202012 e_21 (cost=0.00..24108.03 rows=984303 width=2) (actual time=0.638..412.849 rows=984303 loops=1)
-> Seq Scan on events_p202101 e_22 (cost=0.00..52312.43 rows=2007243 width=2) (actual time=0.693..995.275 rows=2007243 loops=1)
-> Seq Scan on events_p202102 e_23 (cost=0.00..45275.21 rows=1774921 width=2) (actual time=0.740..785.152 rows=1774921 loops=1)
-> Seq Scan on events_p202103 e_24 (cost=0.00..49430.63 rows=1954263 width=2) (actual time=0.733..869.778 rows=1954263 loops=1)
-> Seq Scan on events_p202104 e_25 (cost=0.00..45611.06 rows=1863506 width=2) (actual time=2.947..803.520 rows=1863506 loops=1)
-> Seq Scan on events_p202105 e_26 (cost=0.00..47438.34 rows=1938634 width=2) (actual time=0.724..910.150 rows=1938634 loops=1)
-> Seq Scan on events_p202106 e_27 (cost=0.00..46066.11 rows=1883211 width=2) (actual time=3.475..848.272 rows=1883211 loops=1)
-> Seq Scan on events_p202107 e_28 (cost=0.00..47400.35 rows=1938835 width=2) (actual time=1.219..753.470 rows=1938835 loops=1)
-> Seq Scan on events_p202108 e_29 (cost=0.00..47396.86 rows=1939786 width=2) (actual time=3.280..787.540 rows=1939786 loops=1)
-> Seq Scan on events_p202109 e_30 (cost=0.00..46001.68 rows=1883268 width=2) (actual time=0.749..800.181 rows=1883268 loops=1)
-> Seq Scan on events_p202110 e_31 (cost=0.00..47301.57 rows=1937257 width=2) (actual time=19.304..882.369 rows=1937257 loops=1)
-> Seq Scan on events_p202111 e_32 (cost=0.00..45997.61 rows=1883061 width=2) (actual time=11.621..826.250 rows=1883061 loops=1)
-> Seq Scan on events_p202112 e_33 (cost=0.00..47414.98 rows=1940698 width=2) (actual time=4.640..931.347 rows=1940698 loops=1)
-> Seq Scan on events_p202201 e_34 (cost=0.00..46750.78 rows=1913478 width=2) (actual time=0.701..843.795 rows=1913478 loops=1)
-> Seq Scan on events_p202202 e_35 (cost=0.00..41697.20 rows=1706220 width=2) (actual time=0.666..731.245 rows=1706220 loops=1)
-> Seq Scan on events_p202203 e_36 (cost=0.00..46453.57 rows=1899657 width=2) (actual time=0.836..844.158 rows=1899657 loops=1)
-> Seq Scan on events_p202204 e_37 (cost=0.00..45139.87 rows=1845587 width=2) (actual time=9.452..757.667 rows=1845587 loops=1)
-> Seq Scan on events_p202205 e_38 (cost=0.00..42529.02 rows=1738402 width=2) (actual time=4.410..741.069 rows=1738402 loops=1)
-> Seq Scan on events_p202206 e_39 (cost=0.00..37520.06 rows=1534506 width=2) (actual time=4.289..661.865 rows=1534506 loops=1)
-> Seq Scan on events_p202207 e_40 (cost=0.00..38306.62 rows=1568062 width=2) (actual time=4.993..698.376 rows=1568062 loops=1)
-> Seq Scan on events_p202208 e_41 (cost=0.00..38257.64 rows=1566864 width=2) (actual time=25.921..906.763 rows=1566864 loops=1)
-> Seq Scan on events_p202209 e_42 (cost=0.00..37003.20 rows=1516320 width=2) (actual time=12.365..1117.421 rows=1516320 loops=1)
-> Seq Scan on events_p202210 e_43 (cost=0.00..38222.70 rows=1566870 width=2) (actual time=7.244..693.667 rows=1566870 loops=1)
-> Seq Scan on events_p202211 e_44 (cost=0.00..36989.20 rows=1516320 width=2) (actual time=8.100..638.007 rows=1516320 loops=1)
-> Seq Scan on events_p202212 e_45 (cost=0.00..38231.02 rows=1566702 width=2) (actual time=12.147..755.076 rows=1566702 loops=1)
-> Seq Scan on events_p202301 e_46 (cost=0.00..35298.36 rows=1446336 width=2) (actual time=5.985..682.974 rows=1446336 loops=1)
-> Seq Scan on events_p202302 e_47 (cost=0.00..31903.68 rows=1306368 width=2) (actual time=10.769..589.002 rows=1306368 loops=1)
-> Seq Scan on events_p202303 e_48 (cost=0.00..35325.36 rows=1446336 width=2) (actual time=6.447..719.256 rows=1446336 loops=1)
-> Seq Scan on events_p202304 e_49 (cost=0.00..34196.80 rows=1399680 width=2) (actual time=9.787..620.882 rows=1399680 loops=1)
-> Seq Scan on events_p202305 e_50 (cost=0.00..35339.36 rows=1446336 width=2) (actual time=2.720..660.433 rows=1446336 loops=1)
-> Seq Scan on events_p202306 e_51 (cost=0.00..34196.80 rows=1399680 width=2) (actual time=9.507..612.300 rows=1399680 loops=1)
-> Seq Scan on events_p202307 e_52 (cost=0.00..35325.36 rows=1446336 width=2) (actual time=11.261..622.895 rows=1446336 loops=1)
-> Seq Scan on events_p202308 e_53 (cost=0.00..35309.36 rows=1446336 width=2) (actual time=0.974..589.677 rows=1446336 loops=1)
-> Seq Scan on events_p202309 e_54 (cost=0.00..34151.80 rows=1399680 width=2) (actual time=0.879..619.585 rows=1399680 loops=1)
-> Seq Scan on events_p202310 e_55 (cost=0.00..35277.36 rows=1446336 width=2) (actual time=6.213..567.483 rows=1446336 loops=1)
-> Seq Scan on events_p202311 e_56 (cost=0.00..34131.80 rows=1399680 width=2) (actual time=0.862..590.950 rows=1399680 loops=1)
-> Seq Scan on events_p202312 e_57 (cost=0.00..35224.93 rows=1444393 width=2) (actual time=8.154..670.012 rows=1444393 loops=1)
-> Seq Scan on events_p202401 e_58 (cost=0.00..11.90 rows=190 width=2) (actual time=0.015..0.016 rows=0 loops=1)
-> Seq Scan on events_p202402 e_59 (cost=0.00..11.90 rows=190 width=2) (actual time=0.016..0.016 rows=0 loops=1)
-> Seq Scan on events_p202403 e_60 (cost=0.00..11.90 rows=190 width=2) (actual time=0.005..0.005 rows=0 loops=1)
-> Seq Scan on events_p202404 e_61 (cost=0.00..11.90 rows=190 width=2) (actual time=0.009..0.009 rows=0 loops=1)
-> Seq Scan on events_p202405 e_62 (cost=0.00..11.90 rows=190 width=2) (actual time=0.012..0.012 rows=0 loops=1)
-> Seq Scan on events_p202406 e_63 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p202407 e_64 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p202408 e_65 (cost=0.00..11.90 rows=190 width=2) (actual time=0.011..0.011 rows=0 loops=1)
-> Seq Scan on events_p202409 e_66 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p202410 e_67 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202411 e_68 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202412 e_69 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202501 e_70 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202502 e_71 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202503 e_72 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p202504 e_73 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202505 e_74 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202506 e_75 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202507 e_76 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202508 e_77 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202509 e_78 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.004 rows=0 loops=1)
-> Seq Scan on events_p202510 e_79 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202511 e_80 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202512 e_81 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202601 e_82 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202602 e_83 (cost=0.00..11.90 rows=190 width=2) (actual time=0.009..0.009 rows=0 loops=1)
-> Seq Scan on events_p202603 e_84 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.004 rows=0 loops=1)
-> Seq Scan on events_p202604 e_85 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202605 e_86 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p202606 e_87 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202607 e_88 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202608 e_89 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202609 e_90 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202610 e_91 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202611 e_92 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p202612 e_93 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202701 e_94 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202702 e_95 (cost=0.00..11.90 rows=190 width=2) (actual time=0.012..0.012 rows=0 loops=1)
-> Seq Scan on events_p202703 e_96 (cost=0.00..11.90 rows=190 width=2) (actual time=0.012..0.012 rows=0 loops=1)
-> Seq Scan on events_p202704 e_97 (cost=0.00..11.90 rows=190 width=2) (actual time=0.012..0.012 rows=0 loops=1)
-> Seq Scan on events_p202705 e_98 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202706 e_99 (cost=0.00..11.90 rows=190 width=2) (actual time=0.011..0.011 rows=0 loops=1)
-> Seq Scan on events_p202707 e_100 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p202708 e_101 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202709 e_102 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202710 e_103 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202711 e_104 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202712 e_105 (cost=0.00..11.90 rows=190 width=2) (actual time=0.009..0.009 rows=0 loops=1)
-> Seq Scan on events_p202801 e_106 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202802 e_107 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202803 e_108 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p202804 e_109 (cost=0.00..11.90 rows=190 width=2) (actual time=0.009..0.009 rows=0 loops=1)
-> Seq Scan on events_p202805 e_110 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p202806 e_111 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202807 e_112 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202808 e_113 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p202809 e_114 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202810 e_115 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202811 e_116 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202812 e_117 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.009 rows=0 loops=1)
-> Seq Scan on events_p202901 e_118 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p202902 e_119 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p202903 e_120 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202904 e_121 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202905 e_122 (cost=0.00..11.90 rows=190 width=2) (actual time=0.013..0.013 rows=0 loops=1)
-> Seq Scan on events_p202906 e_123 (cost=0.00..11.90 rows=190 width=2) (actual time=0.013..0.013 rows=0 loops=1)
-> Seq Scan on events_p202907 e_124 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
-> Seq Scan on events_p202908 e_125 (cost=0.00..11.90 rows=190 width=2) (actual time=0.009..0.009 rows=0 loops=1)
-> Seq Scan on events_p202909 e_126 (cost=0.00..11.90 rows=190 width=2) (actual time=0.011..0.011 rows=0 loops=1)
-> Seq Scan on events_p202910 e_127 (cost=0.00..11.90 rows=190 width=2) (actual time=0.013..0.013 rows=0 loops=1)
-> Seq Scan on events_p202911 e_128 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p202912 e_129 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p203001 e_130 (cost=0.00..11.90 rows=190 width=2) (actual time=0.011..0.011 rows=0 loops=1)
-> Seq Scan on events_p203002 e_131 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.008 rows=0 loops=1)
-> Seq Scan on events_p203003 e_132 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p203004 e_133 (cost=0.00..11.90 rows=190 width=2) (actual time=0.007..0.007 rows=0 loops=1)
-> Seq Scan on events_p203005 e_134 (cost=0.00..11.90 rows=190 width=2) (actual time=0.004..0.004 rows=0 loops=1)
-> Seq Scan on events_p203006 e_135 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p203007 e_136 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.004 rows=0 loops=1)
-> Seq Scan on events_p203008 e_137 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p203009 e_138 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p203010 e_139 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p203011 e_140 (cost=0.00..11.90 rows=190 width=2) (actual time=0.003..0.003 rows=0 loops=1)
-> Seq Scan on events_p203012 e_141 (cost=0.00..11.90 rows=190 width=2) (actual time=0.008..0.008 rows=0 loops=1)
Planning Time: 16.109 ms
Execution Time: 368867.504 ms
It does a Seq Scan on each of my partitions, while there actually is an index available. I have also tried querying just one of the larger partitions in a separate query, but that takes over a minute as well, with a similar execution plan. Any idea's to make postgres actually use the index?