For that exact query, not really; doing this operation requires scanning every row. No way around it.
But if you're looking to be able to quickly get the set of unique categories, and you have an index on that column, you can use a variation of the WITH RECURSIVE
example shown in the edit to the question here (look towards the end of the question):
Counting distinct rows using recursive cte over non-distinct index
You'll need to change it to return the set of unique values instead of counting them, but it looks like a simple change:
testdb=# create table tb(id bigserial, category smallint);
CREATE TABLE
testdb=# insert into tb(category) select 2 from generate_series(1, 10000)
testdb-# ;
INSERT 0 10000
testdb=# insert into tb(category) select 1 from generate_series(1, 10000);
INSERT 0 10000
testdb=# insert into tb(category) select 3 from generate_series(1, 10000);
INSERT 0 10000
testdb=# create index on tb(category);
CREATE INDEX
testdb=# WITH RECURSIVE cte AS
(
(SELECT category
FROM tb
WHERE category >= 0
ORDER BY 1
LIMIT 1)
UNION ALL SELECT
(SELECT category
FROM tb
WHERE category > c.category
ORDER BY 1
LIMIT 1)
FROM cte c
WHERE category IS NOT NULL)
SELECT category
FROM cte
WHERE category IS NOT NULL;
category
----------
1
2
3
(3 rows)
And here's the EXPLAIN ANALYZE
:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on cte (cost=40.66..42.68 rows=100 width=2) (actual time=0.057..0.127 rows=3 loops=1)
Filter: (category IS NOT NULL)
Rows Removed by Filter: 1
CTE cte
-> Recursive Union (cost=0.29..40.66 rows=101 width=2) (actual time=0.052..0.119 rows=4 loops=1)
-> Limit (cost=0.29..0.33 rows=1 width=2) (actual time=0.051..0.051 rows=1 loops=1)
-> Index Only Scan using tb_category_idx on tb tb_1 (cost=0.29..1363.29 rows=30000 width=2) (actual time=0.050..0.050 rows=1 loops=1)
Index Cond: (category >= 0)
Heap Fetches: 1
-> WorkTable Scan on cte c (cost=0.00..3.83 rows=10 width=2) (actual time=0.015..0.015 rows=1 loops=4)
Filter: (category IS NOT NULL)
Rows Removed by Filter: 0
SubPlan 1
-> Limit (cost=0.29..0.36 rows=1 width=2) (actual time=0.016..0.016 rows=1 loops=3)
-> Index Only Scan using tb_category_idx on tb (cost=0.29..755.95 rows=10000 width=2) (actual time=0.015..0.015 rows=1 loops=3)
Index Cond: (category > c.category)
Heap Fetches: 2
Planning time: 0.224 ms
Execution time: 0.191 ms
(19 rows)
The number of loops it has to do the WorkTable
scan node will be equal to the number of unique categories you have plus one, so it should stay very fast up to, say, hundreds of unique values.
Another route you can take is to add another table where you just store unique values of tb.category
and have application logic check that table and insert their value when updating/inserting that column. This can also be done database-side with triggers; that solution is also discussed in the answers to the linked question.