So we originally had latin1 for our MySQL database (a long long time ago) and we are trying to convert to UTF8 before a more global outreach, but I'm having issues with the transition. Here's my MySQL:
/* First set as latin1 */
SET NAMES 'latin1';
/* We must change things to blob and then back again */
ALTER TABLE `address` CHANGE line_1 line_1 BLOB;
ALTER TABLE `address` CONVERT TO CHARACTER SET utf8;
ALTER TABLE `address` CHANGE line_1 line_1 VARCHAR(64);
And the error we are getting:
Incorrect string value: '\xF6gberg...' for column 'line_1' at row 7578
ALTER TABLE `address` CHANGE line_1 line_1 VARCHAR(64)
The method we are using is basically described through here: http://www.percona.com/blog/2013/10/16/utf8-data-on-latin1-tables-converting-to-utf8-without-downtime-or-double-encoding/ Any ideas would be great. (Also, since I'm not an expert in MySQL not sure what kind of data you would need, so lemme know if you need anything additional.)
Update
I've tried
SET NAMES 'utf8';
SET NAMES 'utf8mb4';
And I tried using utf8mb4 as was described below. After switching to utf8mb4 (which I'll likely keep), the alteration of of the address still produced the same problem.
Update 2 So I tried looking at converting the string itself to see what's happening and noticed something super weird:
mysql> select line_1 from address where line_1 like '%berg%';
+------------------------+
| line_1 |
+------------------------+
| H�bergsgatan 97 |
+------------------------+
mysql> select CONVERT(line_1 USING utf8) from address where line_1 like '%berg%';
+----------------------------+
| CONVERT(line_1 USING utf8) |
+----------------------------+
| NULL |
+----------------------------+
mysql> select CONVERT(line_1 USING utf8mb4) from address where line_1 like '%berg%';
+-------------------------------+
| CONVERT(line_1 USING utf8mb4) |
+-------------------------------+
| NULL |
+-------------------------------+
mysql> select CONVERT(line_1 USING latin1) from address where line_1 like '%berg%';
+------------------------------+
| CONVERT(line_1 USING latin1) |
+------------------------------+
| Högbergsgatan 97 |
+------------------------------+
So it seems like utf isn't the proper encoding for this? o_O As I'm working with addresses I was able to look it up and it seems like the address is in Stockholm and is supposed to be "Högbergsgatan 97" which matches latin1. I tried the swedish character encoding, but that seems to have failed as well:
mysql> select CONVERT(line_1 USING swe7) from address where addressid = 11065;
+----------------------------+
| CONVERT(line_1 USING swe7) |
+----------------------------+
| H?gbergsgatan 97 |
+----------------------------+
So I'm trying to see what I can do to rectify this.
Also, note that I had forgotten earlier to state that I'm using MySQL 5.6 (if that makes any difference)