39

From what information I could find, they both solve the same problems - more esoteric operations like array containment and intersection (&&, @>, <@, etc). However I would be interested in advice about when to use one or the other (or neither possibly).
The PostgreSQL documentation has some information about this:

  • GIN index lookups are about three times faster than GiST
  • GIN indexes take about three times longer to build than GiST
  • GIN indexes are about ten times slower to update than GiST
  • GIN indexes are two-to-three times larger than GiST

However I would be particularly interested to know if there is a performance impact when the memory to index size ration starts getting small (ie. the index size becomes much bigger than the available memory)? I've been told on the #postgresql IRC channel that GIN needs to keep all the index in memory, otherwise it won't be effective, because, unlike B-Tree, it doesn't know which part to read in from disk for a particular query? The question would be: is this true (because I've also been told the opposite of this)? Does GiST have the same restrictions? Are there other restrictions I should be aware of while using one of these indexing algorithms?

Charles
  • 50,943
  • 13
  • 104
  • 142
Grey Panther
  • 12,870
  • 6
  • 46
  • 64
  • 1
    These measurements vary according to which version of PG you're using. I just ran a test under 9.6.1 against an hstore column and got the following: GiST took 4.1 times longer to build than GIN and was about 170 times slower for lookups. Not even close to the figures you've quoted. – IamIC Dec 28 '16 at 20:25

1 Answers1

18

First of all, do you need to use them for text search indexing? GIN and GiST are index specialized for some data types. If you need to index simple char or integer values then the normal B-Tree index is the best.
Anyway, PostgreSQL documentation has a chapter on GIST and one on GIN, where you can find more info.
And, last but not least, the best way to find which is best is to generate sample data (as much as you need to be a real scenario) and then create a GIST index, measuring how much time is needed to create the index, insert a new value, execute a sample query. Then drop the index and do the same with a GIN index. Compare the values and you will have the answer you need, based on your data.

Andrea Bertani
  • 1,627
  • 12
  • 11
  • 9
    Yes, but the problem being time, it is more wise to ask in case someone already knows... – Nicholas Leonard Jan 12 '09 at 00:01
  • 15
    I think that generating you own data is the the only answer. I spent a few days reading all I could in the context of full-text and spatial about gist and gin indexes, inserting with indexes on or off, wal settings in postgres.sql and talking to people on #postgis/#postgresql and eventually came to the conclusion that the only way to be really sure, for my very specific usage, was to load test tables with large subsets of my actual data, and test. If you rush during setup to save a day here or there, you will end up wasting weeks over the following years, I have learnt by hard experience. – John Powell Sep 20 '11 at 20:05