0

This question can be somehow continuation of MySQL unique 1500 varchar field error (#1071 - Specified key was too long), but briefly:

I applied changes to mysql configuaration and increased key/unique length to 3072 bytes

ROW_FORMAT=DYNAMIC, innodb_file_format = Barracuda, innodb_large_prefix = true

If I do:

CREATE TABLE IF NOT EXISTS `pages` (
  `link` varchar(1500) NOT NULL,
  `domain` varchar(255) NOT NULL,
  `lastvisited` datetime DEFAULT NULL,
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ROW_FORMAT=DYNAMIC;

ALTER TABLE  `pages` ADD UNIQUE (
`link`
)

I got error: "#1071 - Specified key was too long; max key length is 3072 bytes" what is right, as link is varchar(1500) and is 4500 bytes.

Now, with same configuration, I decided to apply next table structure:

CREATE TABLE IF NOT EXISTS `pages` (
  `link` varchar(1500) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
  `domain` varchar(255) NOT NULL,
  `lastvisited` datetime DEFAULT NULL,
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  UNIQUE KEY `link` (`link`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ROW_FORMAT=DYNAMIC;

And I get error:

#1709 - Index column size too large. The maximum column size is 767 bytes.

All changes that were made is CHARACTER SET, but this decreased column size to 767 bytes. No changes to configuration. Why?

Community
  • 1
  • 1
Tigran
  • 1,049
  • 3
  • 15
  • 31
  • 1
    Just an idea, but instead of trying to ensure that a 1500 char URL is unique, why not also store a [hash](https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html#function_sha1) of the `URL` and then check just the hashes. Sure, some tiny chance of collision ... – StuartLC Feb 25 '14 at 11:24
  • Great idea! What about benefits? So when insert I just checking for hash, if 0 rows- insert, if some exists, select all and do next url checking. – Tigran Feb 25 '14 at 11:29
  • Well, the main benefit is speed - the hash will be a few bytes, which can be safely indexed. The downside is that it needs to be computed before an insert. – StuartLC Feb 25 '14 at 11:44
  • So my insert will be like, insert if not exists (select * where hash = '' and link = ''), Is there a way to optimize it? – Tigran Feb 25 '14 at 11:59
  • I suppose construction 'insert on dublicate key' will not work here, as I can not make hash a key. – Tigran Feb 25 '14 at 12:02
  • Hash would make a good key - the duplicate key should work fine. Unfortunately sqlfiddle seems dead at the moment. – StuartLC Feb 25 '14 at 12:21
  • What if 2 url's create same hash? The probality is small, but it is! – Tigran Feb 25 '14 at 15:48

0 Answers0