Recently I noticed that in my database some unicode characters (mostly Japanese) turned into unreadable garbage. For example this:
「オリジナル」 桜庭わかな - Far wandering~7つめのアレクトラ
turned into
「オリジナル〠桜åºã‚ã‹ãª - Far wandering~7ã¤ã‚ã®ã‚¢ãƒ¬ã‚¯ãƒˆãƒ©
The data is obtained from YouTube API (using PHP) then placed into my database. I suppose that Google may have broke something for a short time then fixed it. However now I am unable to obtain much data using Youtube API due to closed channels and/or removed videos.
I already tried:
- converting columns into latin1/blob/binary => utf8
- dumping few rows into file then opening sql file in Notepad++ and changing character encoding (In hope that charset was set wrong)
- asking for the oldest backup restore - it was already garbaged
Is there a way to restore it to the previous state?
I was told that mysql on my server is using xtradb instead of InnoDB. Below are information about my database and table
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
mysql> SHOW FULL COLUMNS FROM my_table;
+--------------+---------------+-----------------+------+-----+-------------------+----------------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+--------------+---------------+-----------------+------+-----+-------------------+----------------+---------------------------------+---------+
| id | int(11) | NULL | NO | PRI | NULL | auto_increment | select,insert,update,references | |
| timestamp | timestamp | NULL | NO | | CURRENT_TIMESTAMP | | select,insert,update,references | |
| etag | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| video_id | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| published | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| title | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| description | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| thumb | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| channel_id | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| channel_name | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| category | decimal(10,0) | NULL | YES | | NULL | | select,insert,update,references | |
| duration | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
| definition | text | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
+--------------+---------------+-----------------+------+-----+-------------------+----------------+---------------------------------+---------+
13 rows in set (0.00 sec)