I have a varchar field that can contain up to 225 chars and I need fast lookups that use this column in the WHERE
clause.
So indexing would be a good idea but I heard that indexing large columns is slowing things down and one should create a separate column that stores a shorter hash of the first column and index this new column.
Then the query would do something like WHERE hash_column = hash(x) AND long_column = x
.
Is this a good solution? And what hash should I use? The table is going to have a lot of entries (over millions). So what about collisions? What if the amount of entries that have the same hash become too much? Do they even become too much or are there hashes that won't collide even when there are billions of entries?
Is it enough to have this 1 index on the hash column or do I need to add any additional indices?
Thanks for your help
Edit: I use InnoDB.