I have the following array with two names:

$x = ['Rodriguez', 'Rodríguez'];

I'm trying to insert these values into my table, which is defined as follows:

Collation: utf8mb4_unicode_520_ci
Engine: InnoDB
[id(primary_ai)  -  name(unique)]

I connect to the database using:

<?php 
try {
    $pdo = new PDO("mysql:host=localhost;dbname=database;charset=utf8mb4;", 'root', 'root', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
        PDO::ATTR_EMULATE_PREPARES => false
    ]);
} catch (PDOException $e) {
    echo "Connection failed: " . $e->getMessage();
}

And I try to insert like this:

$paras = $x;
$pdo->prepare("INSERT INTO names (name) VALUES (?), (?)")->execute($paras);

But I keep getting the error

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'Rodríguez' for key 'name'

How can I deal with this encoding problem exactly?

I've tried using utf8_encode(), but the second name changed to Rodríguez. Then I tried utf8_decode() but got Rodr�guez. Then I tried adding PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'", but that didn't solve the duplicate error either.

Toleo
  • Encode/decode functions are likely to make things worse, as you see. The two names _are_ spelled differently; one has a plain letter `i`, the other has an acute i (`í`). But the collation says that both i's are treated as equal. – Rick James Jul 19 '18 at 21:31
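
Rick James's point can be checked directly in MySQL. The query below is a minimal sketch, assuming a client connection that uses utf8mb4 (as the PDO DSN in the question does); the column aliases are made up for illustration. It compares the two names under each collation and also prints their bytes.

```sql
-- Assumes the connection character set is utf8mb4.
SET NAMES utf8mb4;

SELECT
  -- Accent- and case-insensitive collation: the two names compare as equal.
  'Rodriguez' = 'Rodríguez' COLLATE utf8mb4_unicode_520_ci AS same_under_520_ci,
  -- Binary collation: the names are compared code point by code point.
  'Rodriguez' = 'Rodríguez' COLLATE utf8mb4_bin AS same_under_bin,
  -- The bytes differ: í is encoded as C3 AD in UTF-8.
  HEX('Rodriguez') AS plain_bytes,
  HEX('Rodríguez') AS accented_bytes;
```

If the first column comes back as 1 and the second as 0, the strings are reaching the server intact, and the duplicate error comes from the collation rather than from a broken encoding.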

2 Answers

You need to remove the unique limit from your DB or else have some other encoding that recognizes the difference between i and í.

This might work https://www.w3schools.com/html/html_entities.asp

Paddy Hallihan
  • How exactly do I remove the unique limit of the DB? – Toleo Jul 19 '18 at 16:27
  • It depends on what you're using. I use phpMyAdmin, so it's pretty easy; otherwise there's this: https://stackoverflow.com/questions/3487691/dropping-unique-constraint-from-mysql-table – Paddy Hallihan Jul 19 '18 at 16:30

With your current collation, MySQL treats Rodriguez and Rodríguez as the same value, so you have to change your collation.

You can fix collation by using:

CREATE TABLE Table (...) COLLATE utf8_bin;
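
For a table that already exists, the collation can probably be changed on just the one column instead of recreating the table. This is a sketch only: the table and column names come from the question, but the type `VARCHAR(191) NOT NULL` is an assumption, since the question does not show the real column definition. It uses `utf8mb4_bin` rather than `utf8_bin` so the column stays in the same character set as the connection.

```sql
-- Assumption: name is VARCHAR(191) NOT NULL. MODIFY replaces the whole
-- column definition, so restate the real type, length and nullability.
ALTER TABLE names
  MODIFY name VARCHAR(191)
  CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL;

-- Check which collation the column ended up with.
SHOW FULL COLUMNS FROM names LIKE 'name';
```

Moving from a case-insensitive collation to a binary one only makes comparisons stricter, so rows that were already unique stay unique and the existing unique index should rebuild without conflicts.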
Syed Daniyal Asif
  • What are the advantages and disadvantages of using `utf8_bin` or `utf8mb4_bin` over `utf8mb4_unicode_520_ci`? – Toleo Jul 19 '18 at 16:34
  • That will lead to even capitalization making a difference. Hope that is OK. – Rick James Jul 19 '18 at 21:28
  • @RickJames Does that mean it is better to use the `utf8mb4_bin` collation on columns that require exact matching, like `username`, `private keys`, or in some cases `email`, while `utf8mb4_unicode_520_ci` is better for columns that will be searched using `LIKE` or `WHERE`, because its case insensitivity makes searching easier for the user? – Toleo Jul 20 '18 at 01:46
  • Collations are _aimed_ at flowing text. For user names, do you want `abc` and `Abc` to be the same or different? You probably do want passwords to be fully sensitive, hence `utf8mb4_bin` or even `BINARY`. Isn't email case insensitive? But what about accents? – Rick James Jul 20 '18 at 04:33
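
Since a collation can be set per column, the trade-offs discussed in these comments can be mixed within one table. The table below is purely hypothetical (none of its names or lengths come from the thread) and is only meant to show where each choice would go.

```sql
-- Hypothetical table: names, lengths and column choices are illustrative only.
CREATE TABLE accounts_example (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  -- Case- and accent-insensitive: 'abc' and 'Abc' count as the same user name.
  username VARCHAR(64) CHARACTER SET utf8mb4
    COLLATE utf8mb4_unicode_520_ci NOT NULL,
  -- Exact comparison: case and accents both matter.
  api_key VARCHAR(64) CHARACTER SET utf8mb4
    COLLATE utf8mb4_bin NOT NULL,
  -- Raw bytes with no collation at all, e.g. for a password hash.
  password_hash VARBINARY(255) NOT NULL,
  UNIQUE KEY uq_username (username),
  UNIQUE KEY uq_api_key (api_key)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;
```

For the "what about accents?" case, MySQL 8.0 also ships `utf8mb4_0900_as_ci`, which is accent-sensitive but case-insensitive. It is not available on MySQL 5.7, so whether it is an option depends on the server version in use.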