0

I have a varchar field that can contain up to 225 chars and I need fast lookups that use this column in the WHERE clause.

So indexing would be a good idea but I heard that indexing large columns is slowing things down and one should create a separate column that stores a shorter hash of the first column and index this new column.

Then the query would do something like WHERE hash_column = hash(x) AND long_column = x.

Is this a good solution? And what hash should I use? The table is going to have a lot of entries (over millions). So what about collisions? What if the amount of entries that have the same hash become too much? Do they even become too much or are there hashes that won't collide even when there are billions of entries?

Is it enough to have this 1 index on the hash column or do I need to add any additional indices?

Thanks for your help

Edit: I use InnoDB.

Julius S.
  • 664
  • 9
  • 21
  • 1
    Have a look at this http://stackoverflow.com/questions/8001905/sql-server-worth-indexing-large-string-keys – rpd Feb 01 '17 at 11:03
  • @rpd Thank you, this was really helpful. I also read this detailed article which explains a lot: http://voluntarydba.com/post/2013/07/30/Efficiently-indexing-long-character-strings.aspx – Julius S. Feb 01 '17 at 11:16
  • Experience has shown that trial and error is the best way to handle things apart from obvious best practices. E.g try adding indexes on other columns as well if they are also heavily queried and experiment with execution times – rpd Feb 01 '17 at 11:21

0 Answers0