0

I have site with MySQL databse with a lot of data already in it and i've made a mistake with the encoding of the database and now all of the data is'nt readable via "PHPMyAdmin"

Now i fixed the problem with the encoding and all of the new data is readable but the old data dosen't, the problem was with the connection to the db because i didnt added "utf8" chatset...

The data OLD is readable if i'll remove the mysqli_set_charset($con,"utf8"); from the connection but then the NEW data is'nt readable.

Is there any way to convert the wrong encoded data to correct UTF8?

Image for example with GOOD and BAD lines: enter image description here

Saar Asor
  • 29
  • 8
  • Column data type? Locale? – jarlh Apr 07 '21 at 19:30
  • All of the wrong encoded column type are VARCHAR and the collation is utf8_general_ci – Saar Asor Apr 07 '21 at 19:37
  • You can try utf8_encode ughh even saying the name of this function is bad. This is of course if the previous charset was latin1 – Dharman Apr 07 '21 at 19:39
  • 1
    If the old data is readable one way, then read it that way and call `UPDATE` with the new way. This will be a one-off thing to fix your database so the code is going to feel wasteful and non-optimal, but it is what it is. – Chris Haas Apr 07 '21 at 19:50
  • The old is "corrputed" is'nt readable via the phpmyadmin but it's readable from the site (if the connection isn't configured to UTF8) i don't know what is the previous charset that the data is.... i've tried to export via phpmyadmin the database and i've selected every single one from the available chatset there and it's always unreadable... – Saar Asor Apr 07 '21 at 19:54
  • @ChrisHaas nice idea, i'll try to make it, i'll need to use 2 connctions to the db one to read and one to update – Saar Asor Apr 07 '21 at 19:55
  • 1
    Related: [UTF-8 all the way through](https://stackoverflow.com/questions/279170/). BTW, someone _could_ find a possible [mojibake](https://en.wikipedia.org/wiki/Mojibake) instance - if you add _textual_ examples of unreadable strings in your [mcve]… – JosefZ Apr 07 '21 at 19:59
  • You shouldn't need 2 connections to read and update. – Dave Apr 07 '21 at 20:22
  • @Dave how can i read non UTF8 and update with UTF8? that's exactly the problem i had... the connection wasn't UTF8 configured – Saar Asor Apr 07 '21 at 21:37
  • Slightly unrelated but close enough to be relevant: You really should use utf8mb4 instead of utf8 in MySQL, utf8 isn't a true utf8 implementation as it only uses three bytes. If your data can include high byte characters like emojis you'll have problems with reading and/or displaying the data. – h00ligan Apr 07 '21 at 22:15

1 Answers1

0

well, the best solution in my head is manually update,

take a database backup, configure and put this file in anywhere in your web server and use it.

edit i forgot to say add a new column named reconfigured to this table

<?php

$server = "localhost";
$username = "root";
$password = "";
$database = "database";

$table = "corrupted table";

//utf8
$con = mysqli_connect($server, $username, $password, $database);
mysqli_set_charset($con,"utf8");
//auto
$con2 = mysqli_connect($server, $username, $password, $database);

if (isset($_GET['id']) and isset($_GET['which'])) {
  switch($_GET['which']) {
    case 1: $correct_string = mysqli_query($con, "SELECT name FROM `".$table." WHERE `id`='".$_GET['id']."'"); $correct_string = mysqli_fetch_row($con, $correct_string); break;
    case 2: $correct_string = mysqli_query($con2, "SELECT name FROM `".$table." WHERE `id`='".$_GET['id']."'"); $correct_string = mysqli_fetch_row($con2, $correct_string); break;
  }
  mysqli_query($con, "UPDATE `".$table."` SET `name` = '".$correct_string['0']."' WHERE `".$table."`.`id` = '".$_GET['id']."'");
  mysqli_query($con, "UPDATE `".$table."` SET `reconfigured` = '1' WHERE `".$table."`.`id` = '".$_GET['id']."'");
}

$string1 = mysqli_query($con, "SELECT name FROM `".$table."` WHERE reconfigured = '0' LIMIT 1");
$string2 = mysqli_query($con2, "SELECT name FROM `".$table."` WHERE reconfigured = '0' LIMIT 1");
  $string1 = mysqli_fetch_row($con, $string1);
  $string2 = mysqli_fetch_row($con2, $string2);
  $string1 = $string1['0'];
  $string2 = $string2['0'];
?>


<form action="">

    <input type="text" id="id" name="id" value="<?= $id ?>"></input><br>
    
  <input type="radio" id="which" name="which" value="1"><?= $string1 ?><br>

  <input type="radio" id="which" name="which" value="2"><?= $string2 ?><br><br>
  
  <input type="submit" value="continue">

  </form>

i haven't tested this script, use with caution and take a database backup

edit, this way is automatic, if it has one of the broken words (like œ) it updates it with the other method, i double checked the code but you check too and take a database backup again, i kinda don't trust myself

<?php

$broken = array('×', '¢', 'œ', 'ª');
$server = "localhost";
$username = "root";
$password = "";
$database = "database";

$table = "corrupted table";

//utf8
$con = mysqli_connect($server, $username, $password, $database);
mysqli_set_charset($con,"utf8");
//auto
$con2 = mysqli_connect($server, $username, $password, $database);
$fixed = 0;
while ($fixed == 1) {
$string1 = mysqli_query($con, "SELECT name FROM `".$table."` WHERE reconfigured = '0' LIMIT 1");
$string2 = mysqli_query($con2, "SELECT name FROM `".$table."` WHERE reconfigured = '0' LIMIT 1");
  $string1 = mysqli_fetch_row($con, $string1);
  $string2 = mysqli_fetch_row($con2, $string2);
  if(implode(null,$string1) == null){ die('done'); }
  if(implode(null,$string2) == null){ die('done'); }
  $string1 = $string1['0'];
  $string2 = $string2['0'];
  
while ($found == 1) { 
foreach ($broken as $badword) {
if(strpos($string1, $badword)) { $found = 1; $broken = 1; }
} 
$found = 1;
}
if ($broken == 1) {
  mysqli_query($con, "UPDATE `".$table."` SET `name` = '".$string2."' WHERE `".$table."`.`id` = '".$id."'");

  
  
}
  mysqli_query($con, "UPDATE `".$table."` SET `reconfigured` = '1' WHERE `".$table."`.`id` = '".$id."'");
unset($broken);
unset($found);
}
?>
norahCii
  • 123
  • 1
  • 12
  • i just realized maybe i can automate that, can you copy paste me a unreadable row from name column? – norahCii Apr 07 '21 at 20:47
  • i did something like that, for each table i did seperate file for saftey, 2 db connections (1 with utf8 and the other one without) and loop read the original, then update with UTF8 connction - works perfectly :) thank you! – Saar Asor Apr 07 '21 at 21:33
  • `INSERT INTO orders (id, products, price, qty, ordernu) VALUES (10, 'נצמד שלישיה ניילון 300/45 ML', '79', '3', 4), (3, 'תיק וועידה ממותג', '12', '500', 2);` how it's should be: `INSERT INTO orders (id, products, price, qty, ordernum) VALUES (10, 'נצמד שלישיה ניילון 300/45 ML', '79', '3', 4), (3, 'תיק וועידה ממותג', '12', '500', 2);` – Saar Asor Apr 07 '21 at 21:35
  • 1
    i added the automatic way, don't forget to take a database backup again – norahCii Apr 07 '21 at 23:19