0

Let's assume I have long unique hashes (like 250-character long). I store them as the primary key. Will lookup for a certain row be much slower than if the primary key was based on an int?

ebvtrnog
  • 4,167
  • 4
  • 31
  • 59
  • possible duplicate of [Strings as Primary Keys in SQL Database](http://stackoverflow.com/questions/517579/strings-as-primary-keys-in-sql-database) – Fabien Thouraud Jun 01 '15 at 10:30

1 Answers1

1

I've made simple benchmark by preparing tables:

create table test_int(id integer primary key);
create table test_text(id text primary key);

insert into test_int select * from generate_series(1,1000000);
insert into test_text select md5(random()::text)||md5(random()::text)||md5(random()::text)||md5(random()::text)||md5(random()::text)||md5(random()::text)||md5(random()::text)||md5(random()::text)
    from generate_series(1,1000000);

analyze;

explain analyze
select * from test_int where id=<somevalue>;

Index Only Scan using test_int_pkey on test_int  (cost=0.42..8.44 rows=1 width=4) (actual time=0.023..0.024 rows=1 loops=1)
  Index Cond: (id = 58921)
  Heap Fetches: 1
Planning time: 0.129 ms
Execution time: 0.058 ms

explain analyze
select * from test_text where id=<somevalue>;

Index Only Scan using test_text_pkey on test_text  (cost=0.68..8.69 rows=1 width=260) (actual time=0.105..0.106 rows=1 loops=1)
  Index Cond: (id = '3e74346a6a060a208b3916c9b289d1a5020fa9ff3175fdb78022001a2e5f0857a70bddfbd0f148b11ec8928edd9ce008835c9b8624f6afb7b05e2d05ad9cc049418b046ea5e0d4814ffc34fc7fad476996c634353189140702148ba6c1f8fb055ca8ff18747bb3e62f3ca7189fb5165dff850c699e6 (...)
  Heap Fetches: 1
Planning time: 0.122 ms
Execution time: 0.140 ms

On my machine lookup by numeric id was approx twice as fast in comparison to text id;

What's more interesting is PK index size:

select indexname, pg_size_pretty(pg_relation_size(indexname::text)) from pg_indexes  where tablename in ('test_int', 'test_text');

"test_int_pkey"   "21 MB"
"test_text_pkey"  "420 MB"

So 250 chars long texts are certainly much slower than integer ids.

Radek Postołowicz
  • 4,506
  • 2
  • 30
  • 47