0

I want to search user input in my database. database collation is latin1_swedish_ci. I don't want to change that, instead can I change user input utf-8 to latin1_swedish_ci?

Edit:
I approach two methods.
Method 1: I imported and used default collation latin1_swedish_ci and character set latin1. Then I have
image

Here I can query like SELECT * FROM dict WHERE english_word = '$_value' and I get all the values of column including malayalam_definition in the browser as desired. But problem is I can't query like SELECT * FROM dict WHERE malayalam_definition = '$_value'. It returns no result.

Method 2: I changed collation to utf8_unicode_ci and character set to utf8. Then in mysql I get desired values like
image
Here I when I query like SELECT * FROM dict WHERE english_word = '$_value' in browser I get question marks in malayalam_definition values like
image

Result of SHOW VARIABLES LIKE 'character\_set\_%';

+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |
| character_set_connection | utf8   |
| character_set_database   | utf8   |
| character_set_filesystem | binary |
| character_set_results    | utf8   |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)

Do I need to change character_set_server, then how to do it?

Hfyuu
  • 199
  • 1
  • 3
  • 11
  • "instead can I change user input utf-8 to latin1_swedish_ci" No you shouldn't it's generally not a great idea to convert `utf8` to `latin1` simply because `utf8` does support more then `latin1` does so you can get corrupted data. – Raymond Nijland Feb 07 '19 at 15:16
  • letme try, Is it possible? – Hfyuu Feb 07 '19 at 15:21
  • `character_set_server = latin1` is not the villain. See the Edit to my Answer. How did you do the "import"? `LOAD DATA` has a charset clause in it; what did you say? `mysqldump` sets the charset in the file; what did it say? – Rick James Feb 08 '19 at 05:58
  • I used this command to load data `load data local infile '/mnt/c/Users/justi/Desktop/enml/file.csv' into table dict fields terminated by '\t' IGNORE 1 LINES;` – Hfyuu Feb 08 '19 at 08:12
  • `SET NAMES utf8` Worked for me, now showing correct character in the browser and phpmyadmin. Still I can't search for the malayalam_definition column like `SELECT * FROM dict WHERE malayalam_definition = '$_value'` giving empty result – Hfyuu Feb 08 '19 at 08:59

1 Answers1

0

First of all, the "database collation" is only a default. The real question is what is the CHARACTER SET of the columns that you are interested in.

Then, what are the bytes in your client? Are they encoded as latin1? Or utf8? In either case, tell MySQL that that is what is coming at it. This is preferably done in the connection parameters. (What is your client language?) Alternatively, use SET NAMES latin1 or SET NAMES utf8, according to the client encoding.

Now, what MySQL will do on INSERT and SELECT... It will convert the encoding from the client's encoding to the column's encoding as you do an INSERT. No further action is needed to achieve this.

Similarly, MySQL will convert the other way during a SELECT.

(Of course, if the column and the client are talking the same encoding, no "convert" is needed.)

Your question mentions "collation". So far, I have only talked about CHARACTER SETs, also known as "encoding". Contrast with that, the sorting and comparing of two strings -- this is COLLATION.

For the CHARACTER SET latin1, the default COLLATION is latin1_swedish_ci.
For the CHARACTER SET utf8, the default COLLATION is utf8_general_ci.
There are several different "collations" to handle the quirks of German or Turkish or Spanish or (etc) orderings.

Please explain why you are trying to do what you stated. There are many ways you can do it wrong, so I do not want to give you an ALTER statement -- it may just make things worse for the real goal.

It is better to use utf8mb4 instead of utf8. The outside world refers to UTF-8; this is equivalent to MySQL's utf8mb4.

Edit (after OP's Edit)

The first screenshot shows "Mojibake". Another screenshot shows question marks. The causes of each are covered in Trouble with UTF-8 characters; what I see is not what I stored

Rick James
  • 135,179
  • 13
  • 127
  • 222