2

I have legacy system running MySQL 5.0.67. Columns collation is set to latin1_swedish_ci. PHP scripts that get data from DB and generates page with charset=windows-1257. In PHP page language specific characters are shown correctly. When I look directly into DB data I can see that these characters are not shown properly. I see Rûta instead of Rūta, Agnë instead of Agnė. I can't write to database or change its parameter.

When I run

SELECT a.Name COLLATE cp1257_general_ci FROM agent a

i get error

COLLATION 'cp1257_general_ci' is not valid for CHARACTER SET 'latin1'

How to get data from db with proper characters?

EDIT:

SHOW VARIABLES LIKE '%char%';
character_set_client    latin1
character_set_connection    latin1
character_set_database  cp1257
character_set_filesystem    binary
character_set_results   
character_set_server    latin1
character_set_system    utf8
character_sets_dir  /usr/local/share/mysql/charsets/

and

show variables like 'collation%';
collation_connection    latin1_swedish_ci
collation_database  cp1257_lithuanian_ci
collation_server    latin1_swedish_ci
Augustas
  • 1,167
  • 21
  • 31
  • Sounds more like an issue with your client than with the server or the data. And I doubt that the collation is your issue, more likely the character encoding of your client seems to be off. What client are you using? – arkascha May 29 '15 at 12:10
  • I'm using DbVisualizer 9 it has connection encotig set to cp1257. – Augustas May 29 '15 at 12:32
  • http://stackoverflow.com/questions/3513773/change-mysql-default-character-set-to-utf-8-in-my-cnf?rq=1 – mysqlrockstar May 29 '15 at 12:42
  • Interestingly CONVERT(CONVERT(ColumnName USING binary) USING cp1257) shows characters correctly. – Augustas Jun 26 '15 at 07:02

2 Answers2

0

Checkout if you are getting latin1 from

mysql> SHOW VARIABLES LIKE '%char%';

Then, In the mysql configuration (my.cnf) add the following lines, and restart mysql server

[mysqld]    
character-set-server = utf8
collation-server = utf8_general_ci
character-set-client-handshake = false

Finally check what you are getting from

mysql> show variables like 'collation%'; 

For reference :

http://dev.mysql.com/doc/refman/5.6/en/charset-literal.html

http://dev.mysql.com/doc/refman/5.6/en/charset-applications.html

mysqlrockstar
  • 2,536
  • 1
  • 19
  • 36
0

Plan A: Work entirely with cp1257:

SET NAMES cp1257;
declare columns to be `CHARACTER SET cp1257`

mysql> SET NAMES cp1257;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | cp1257                                        |
| character_set_connection | cp1257                                        |
| character_set_database   | latin1                                        |
| character_set_filesystem | binary                                        |
| character_set_results    | cp1257                                        |
| character_set_server     | latin1                                        |
| character_set_system     | utf8                                          |

Note that 3 Variables are set.

Plan B: Use utf8 instead of cp1257.

The SET NAMES must match the encoding you have in the client. If ū is the single byte hex FB, the you must use cp1257. If it is the 2-byte hex C5AB, you must use utf8.

Independently, you can set the table columns to be either cp1257 or utf8. cp1257 allows for only Lithuanian and a few other languages. utf8 allows for all languages.

FB in latin1 renders as û, hence the incorrect string you got.

Rick James
  • 135,179
  • 13
  • 127
  • 222