3

I'm trying to switch one of my DB tables over to utf8mb4 from utf8_general_ci.

ALTER TABLE d4b80le1jha CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin

This brings up the following error:

1071 - Specified key was too long; max key length is 1000 bytes

I've read a number of answers about this error messages but can't seem to solve my problem. Most of them talk about defining new indexes, but not what to do about existing ones.

I've also tried running the following, as recommended here, which made no difference:

SET @@global.innodb_large_prefix = 1;

An aggrevating factor is probably that I am using varchar hashes as primary keys. This is a legacy feature outside of my control. I have not set explicit sizes on my indexes - I wonder if that's what I should be looking at.

Table:

Field           Type            Null    Key     Default
-------------------------------------------------------
id              varchar(11)     NO      PRI     NULL    
link            varchar(255)    NO              NULL    
title           varchar(255)    NO              NULL    
description     varchar(255)    YES             NULL    
pubdate         datetime        NO              NULL    
img_url         varchar(255)    YES             NULL    
team_id         varchar(11)     NO      MUL     NULL    
source_id       varchar(11)     NO              NULL    
hits            int(11)         YES             NULL

Indexes:

Keyname         Type    Unique  Packed  Column      Cardinality Collation   Null
--------------------------------------------------------------------------------
PRIMARY         BTREE   Yes         No  id          13407       A           No
Unique combo    BTREE   Yes         No  team_id                 A           No
                                        source_id               A           No
                                        link        13407       A           No
Mitya
  • 33,629
  • 9
  • 60
  • 107
  • 1
    I just ran into this error message myself. For me, switching the database engine from `MyISAM` to `InnoDB` allowed for a larger index size to accommodate the `utf8mb4` character set. Reducing the column and index size is a non-starter for me and I don't want to be monkeying around with character sets at the column level. Since the MyISAM engine is going to be removed from MySQL in the future, switching to InnoDB was necessary anyway. – CubicleSoft Jun 17 '21 at 15:13

2 Answers2

3

It's probably your Unique combo index causing trouble, I mean your index on (team_id, source_id, link). The total length in characters of that index is 277, so its total maximum length in bytes is four times that: 1108.

What can you do about this?

  • reduce the length in characters of your link column to 228, so the max length in characters of your index is 250. That's probably easiest as long as your values still fit.

  • recreate your index mentioning link(228) instead of just link to index just the leading characters of link. I don't know if this is a good idea; the purpose of your index is to define a unique constraint. If you don't index the entire value of link you aren't exactly doing that.

  • Instead of converting all the character columns in your table to utfmb4 character encoding, just convert the title and description columns. This might work for you. Why? Your _id hashes probably use a limited character set and simply don't need unicode. Your link column values are probably URLS and also can use a limited (latin1) character set. So, just title and description may need to be written in Hebrew or Chinese or contain emojis or whatever. This is probably your best solution.

(Notice that character encoding is actually specified column-by-column. The ability to convert an entire table is provided for convenience.)

So you can do

ALTER TABLE d4b80le1jha 
     MODIFY title VARCHAR(255) 
                  CHARACTER SET utf8mb4
                  COLLATE utf8mb4_bin;
ALTER TABLE d4b80le1jha 
     MODIFY description VARCHAR(255) 
                        CHARACTER SET utf8mb4
                        COLLATE utf8mb4_bin;

Finally, because these two columns are human readable text, you might want to use utf8mb4_general_ci for the collation of those columns rather than utf8mb4_bin. Searching and ordering will probably meet the expectations of your users a little better.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
2

There are 5 steps for fixing that problem that existed in 5.5 and 5.6, but has been defaulted away in 5.7.

If you are hitting the limit because of trying to use CHARACTER SET utf8mb4. Here's one approach:

To raise the limit to 3072 bytes:

SET GLOBAL innodb_file_format=Barracuda;
SET GLOBAL innodb_file_per_table=1;
SET GLOBAL innodb_large_prefix=1;
logout & login (to get the global values);
ALTER TABLE tbl ROW_FORMAT=DYNAMIC;  -- (or COMPRESSED)

A few more are given in http://mysql.rjweb.org/doc.php/limits#767_limit_in_innodb_indexes

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • "do one of the following" is very strange here. Were you aware of what you are saying, or just *copy-past something* is good enough for anything? ;) – chang zhao Nov 28 '18 at 11:57
  • @changzhao - Oops. Thanks for pointing it out. Yes, it was a copy-paste mistake. – Rick James Nov 28 '18 at 16:10