0

Hi I recently changed the hosting provider for my website. When doing this I exported the mysql database I had in my previous cpanel phpmyadmin. It had CHARACTER SET latin1 and COLLATE latin1_swedish_ci. After I importing it to my new phpmyadmin I saw there was an issue with displaying the characters written in Czech ě ř č ů which appeared as question mark or weird symbols etc. I also wasn't able to insert these letters at first but after changing the table CHARSET to utf8 I'm able to insert them. But how do I export the data from my old database and import it in the new one without messing up the data? Here's what the database looks like:

    SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
    SET AUTOCOMMIT = 0;
    START TRANSACTION;
    SET time_zone = "+00:00";
    
    
    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
    /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
    /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
    /*!40101 SET NAMES utf8mb4 */;
    
    --
    -- Database: `sambajiu_samba`
    --
    
    -- --------------------------------------------------------
    
    CREATE TABLE `bookings` (
      `id` int(11) NOT NULL,
      `fname` varchar(100) NOT NULL,
      `surname` varchar(100) DEFAULT NULL,
      `email` varchar(255) NOT NULL,
      `telephone` varchar(100) NOT NULL,
      `age_group` varchar(100) DEFAULT NULL,
      `hear` varchar(100) DEFAULT NULL,
      `experience` text,
      `subscriber` tinyint(1) DEFAULT NULL,
      `booking_date` varchar(255) DEFAULT NULL,
      `lesson_time` varchar(255) NOT NULL,
      `booked_on` datetime DEFAULT CURRENT_TIMESTAMP
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8;

    ALTER TABLE `bookings` ADD PRIMARY KEY (`id`);

    ALTER TABLE `bookings` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=345;
admec986
  • 26
  • 4

1 Answers1

0

Czech is not handled by latin1. It would be better to use utf8mb4 (which can handle virtually everything in the world). Outside of MySQL, it is called "UTF-8".

How did you do the "export" and "import"? What is in the file? Can you get the hex of a small portion of the exported file -- we need to check what encoding was used for the Czech characters.

As for "as question mark or weird symbols", see question marks and Mojibake in Trouble with UTF-8 characters; what I see is not what I stored .

Your hex probably intended to say

Rezervovat trénink zda

In the middle of the hex is

C383 C2A9

Which is UTF-8 for é. When you display the data, you might see that, or you might see the desired é. In the latter case, the browser is probably "helping" you by decoding the data twice. For further discussion on this, see "double encoding" in the link above.

"Fixing the data" is quite messy:

 CONVERT(BINARY(CONVERT(CONVERT(
 UNHEX('52657A6572766F766174207472C383C2A96E696E6B207A6461') 
   USING utf8mb4) USING latin1)) USING utf8mb4)

==> 'Rezervovat trénink zda'

But, I don't think we are finished. that acute-e is a valid character in latin1. You mentioned 4 Czech accented letters that, I think, are not in Latin1. Latin5 and dec8 may be relevant.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • I did it using phpmyadmin. I've added a sample of the database to my question – admec986 Oct 17 '22 at 08:32
  • @admec986 - Was `DEFAULT CHARSET=utf8;` on the table both before and after? Were there any parameters on the dump and load? What was/is the default charset for the connection to the database before/after? – Rick James Oct 17 '22 at 17:25
  • No when I imported it it was CHARSET=latin1 but since this didn't allow be to write the special characters in the form I changed it to utf8. However this didn't solve the problem of all the original data being completely messed up. Both of the phpmyadmin I imported it from and exported to has the Server connection collation : utf8mb4_unicode_ci . I don't think there was any parameters I just clicked export or the import button – admec986 Oct 17 '22 at 20:41
  • With `SELECT col, HEX(col) FROM ... WHERE ...` for a small amount of data, I can help you determine how much damage there is. `HEX()` shows what is actually on the disk; all the "charset" clues are less definitive. – Rick James Oct 17 '22 at 20:54
  • Ok so I ran this for a table named home_section_five: SELECT s5_btn1, HEX(s5_btn1) FROM home_section_five WHERE id=1;;; and it returns 52657A6572766F766174207472C383C2A96E696E6B207A6461... – admec986 Oct 17 '22 at 21:51