2

I could not easily find that information in the docs:

I started all my tables with utf8_bin, now I have one table with a column I would prefer to change to utf8_mb4_900_as_ci and consider utf8_mb4_900_ai_ci.

My question: Would this negatively affect JOINs with utf8_bin fields in terms of index use and performance ?

John
  • 7,507
  • 3
  • 52
  • 52
  • My understanding is the index performance would not be impacted. However the matching of records will be, depending on the actual data being joined. As one will be case-sensitive, another will be accent-sensitive and another that is neither. eg: `A <> a` vs `ü <> u` So `JOIN ON ü = u` will produce results for the `ai_ci` where `as_ci` and `bin` will not. Additionally considering utf8(mb3) vs utf8mb4, will also not be identical due to the extra characters that mb4 supports (non-BMP or supplemental characters stored with 4 bytes, where utf8_bin would support only 3 bytes). – Will B. May 28 '20 at 02:40
  • 1
    Does this answer your question? [Performance difference between table collations in MySQL](https://stackoverflow.com/questions/44954498/performance-difference-between-table-collations-in-mysql) – kmoser May 28 '20 at 02:46
  • 1
    Voting to reopen because I think the 'closers' misinterpreted the question. – Rick James May 31 '20 at 18:19
  • Please provide a specific example of the `SELECT` with the `JOIN` in question. Also provide `SHOW CREATE TABLE` for the two tables as they would look after the `ALTER`. – Rick James May 31 '20 at 18:53

1 Answers1

3

If your change leads to a JOIN between two columns with different COLLATIONs, performance may suffer hugely.

If both columns are exactly the same character set and collation, the JOIN can use an INDEX on one or the other column.

If the columns differ in CHARACTER SET (and hence COLLATION), then one column may be convertible to the other.

(Please provide a specific example.)

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Previously my query did not use the indices it had for varchar on both sides. I have adjusted the collation and now execution plan shows Unique Key Lookup instead of Full Scan. And the performance indeed improved _hugely_. Thanks! – Kuba D Aug 09 '23 at 12:03