2
  • I have a first column typed as varchar(190) that is using utf8mb4_bin collation.

When I perform the following query I only get back all of Joe as expected:

SELECT first, last FROM person WHERE first = 'Joe'

What I would like to get is Joe, joe, jOe, joE, jOE, JoE, JOE, and JOe. Basically a case-insensitive search on a case-sensitive field.

How do I do this?

CREATE TABLE `person` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first` varchar(190) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
  `middle` varchar(190) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
  `last` varchar(190) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
  `job` varchar(190) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_UNIQUE` (`id`),
  UNIQUE KEY `names_unq` (`first`,`middle`,`last`,`job`),
  KEY `index_job` (`job`),
  KEY `index_first` (`first`,`job`),
  KEY `index_first_last` (`first`,`last`,`job`),
  KEY `index_middle` (`middle`,`job`),
  KEY `index_last` (`job`,`last`)
) ENGINE=InnoDB AUTO_INCREMENT=99750823 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Dai
  • 141,631
  • 28
  • 261
  • 374
Ethan Allen
  • 14,425
  • 24
  • 101
  • 194
  • 1
    Does this answer your question? [How can I search (case-insensitive) in a column using LIKE wildcard?](https://stackoverflow.com/questions/2876789/how-can-i-search-case-insensitive-in-a-column-using-like-wildcard) – sajjad rezaei Mar 12 '22 at 04:05
  • 1
    `utf8mb4_bin` is a collation, not a column-type. What is the **exact** column-type? `binary`? `varbinary`? `blob`? `varchar`? Please post your raw `CREATE TABLE` statement. – Dai Mar 12 '22 at 04:09
  • Does this answer your question? [How can I make SQL case sensitive string comparison on MySQL?](https://stackoverflow.com/questions/5629111/how-can-i-make-sql-case-sensitive-string-comparison-on-mysql) – kmoser Mar 12 '22 at 04:12
  • @Dai - I understand that. They are VARCHAR. But as you can see in my CREATE statement which I posted, you can set a collation type per column in MySQL. That is what I was referring to. – Ethan Allen Mar 12 '22 at 04:15
  • But your question says "against a binary table", but you don't actually have any `binary` columns... – Dai Mar 12 '22 at 04:16
  • @Dai the collation type `utf8mb4_bin` is a binary collation (hence the _bin suffix) which is what makes it case-sensitive. See: https://dev.mysql.com/doc/refman/8.0/en/charset-binary-collations.html – Ethan Allen Mar 12 '22 at 04:22
  • I've reworded your question. – Dai Mar 12 '22 at 04:23
  • @Dai - You're right the question was confusing. Thank you. – Ethan Allen Mar 12 '22 at 04:24

3 Answers3

4

You can specify a collation in a string comparison expression to override the collation used in the comparison. Read https://dev.mysql.com/doc/refman/8.0/en/charset-literal.html for more details on this.

CREATE TABLE `person` (
  `first` text COLLATE utf8mb4_bin,
  `last` text COLLATE utf8mb4_bin
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin

mysql> select first, last from person where first = 'Joe';
+-------+-------+
| first | last  |
+-------+-------+
| Joe   | Grant |
+-------+-------+

mysql> select first, last from person where first = 'joe';
Empty set (0.00 sec)

mysql> select first, last from person where first = 'joe' collate utf8mb4_unicode_ci;
+-------+-------+
| first | last  |
+-------+-------+
| Joe   | Grant |
+-------+-------+
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • @Akina I wrote my comment reply before Bill updated their answer to add it, I've now deleted my comment. And don't forget that what might seem obvious or implicit to you certainly won't be to someone who is new or inexperienced with SQL or MySQL. – Dai Mar 12 '22 at 10:54
0

use "collate utf8mb4_unicode_ci" as it makes one-to-one comparison with character to whatever the filter condition you have given.

K D
  • 51
  • 4
0

The simplest way to do this is to use UPPER().
NB: This is not optimised (unless there is an index UPPER() which is unlikely) but it is a quick fix for simple queries. I do not recommend it for large scale or production queries without testing the speed/cost.

SELECT first, last FROM person WHERE UPPER(first) = 'JOE';

If you are matching a parameter you might need to use upper on both side as in

SELECT first, last FROM person WHERE UPPER(first) = UPPER(@name);
  • _This is a bad idea_ because using functions in predicates means the query won't be SARGable unless you add an index on a function's output over a query (assuming the RDBMS even supports that, some don't). – Dai Mar 12 '22 at 10:56
  • Is `collate utf8mb4_unicode_ci` SARG able? It would be interesting to know which costs the most. I'm suggesting this as a simple solution for small one off queries and not for large scale production one. –  Mar 12 '22 at 11:08
  • The only way to know is by comparing execution-plans - it's entirely possible that an RDBMS will detect the use of `UPPER` _as an idiom_ and treat it like a `collate` query, but I wouldn't bet on it. – Dai Mar 12 '22 at 11:12