I am aware of the index size limit of 767 Bytes when using UTF8, but I need a varchar field to be 512 characters, UTF8 and indexable so that I can update it using INSERT ON DUPLICATE KEY UPDATE. It works fine if the field is 255 characters, but I need to store Windows file paths that are greater than 255 characters, may also contain unicode and foreign characters. Any suggestion for a workaround would be very much appreciated.
Asked
Active
Viewed 226 times
0
-
If it works fine for 255 varchar field length that implies you are using the MySQL specific UTF-8 3-bytes rather than true UTF-8 which is 4 bytes per character. The maximum size for a unique indexed string using UTF-8 4-byte [which is the true UTF-8](http://stackoverflow.com/a/279279/3536236) is approximately 190 characters. If you are going to have "foreign" characters in your field you should be using the `UTF8mb4_` character set in MySQL – Martin Jul 25 '16 at 03:06
-
Thank you for correcting me on needing to use UTF8mb4. – rerat Jul 25 '16 at 03:38
2 Answers
1
You can store hash of your data on index column and store original data on another column.
SHA1('over 225 character') => 2c42692a0369f4a6060850f1a7997ae7b112d23b

user3089834
- 96
- 2
-
This worked perfectly. I created another field for the hash, use it for the index and then put the actual data in a 512 character field. – rerat Jul 25 '16 at 03:37
0
Two workarounds, one is change engine to TokuDB and use clustered index; Another is to use md5 of UTF8 field as key instead of UTF8 filed directly.

King
- 11
- 3