1

This issue is absolutely killing me, I cannot get a single solution online to work!

I am trying to import the following text from a CSV to a MySQL table - via Navicat's import wizard:

L154 – TRAINING WARRANTY

The hyphen is a wide hyphen and so far I've managed to import it as either a question mark, or a black diamond with question mark inside. Same for £ symbols and other special characters.

enter image description here

Everyone always talks about UTF-8. So far I have tried:

Saving the CSV in Excel, clicking Tools > Web Options > Encoding: UTF-8

Right clicking the database and clicking EDIT. Setting Char set to utf8 Unicode and Collation: utf8_general_ci

I have "designed" the table and set the 2 options above to exactly the same.

I have edited the varchar field in question and set the same 2 fields again to the 2 types above.

But my hyphen will not import correctly.

It would be nice to know exactly how to go about importing data that has £ symbols and other special characters once and for all.

James Wilson
  • 809
  • 3
  • 14
  • 25
  • I use "latin1_swedish_ci" for the table and the varchar field and just tried to save "£" and it worked for me. – Ofir Baruch Nov 06 '17 at 12:09
  • That's not a "hyphen", it is an "en dash". The hex (in UTF-8) is 3 bytes: `E28093`. If you can get the hex of the csv file, please confirm. Meanwhile, don't use `latin1` for the table/column, use `utf8` (or utf8mb4). – Rick James Nov 06 '17 at 20:02

1 Answers1

1

You can use htmlentities() to store these symbols in encoded form like e.g.

£ will be £ and will be –, by using:

echo htmlentities('£');

and when retrieving, just use html_entity_decode() Like:

echo html_entity_decode('£'); // output £

Edit: As discussed in the comments, how you are trying to import the data from a CSV file. You have to change the encoding for the CSV file, which can be done using Notepad++, By going to Encoding->Encode in UTF-8

mega6382
  • 9,211
  • 17
  • 48
  • 69
  • Hi Mega. I'm not using PHP until the data is being output on the page. It's being stored via an import so htmlentities doesn't come into play here. – James Wilson Nov 06 '17 at 14:54
  • I have set Char Set to UTF8 in the table options - and every text/varchar field in that table to utf8 and I still get the diamonds. This seems an impossible situation! – James Wilson Nov 06 '17 at 15:05
  • Just to add to this - this query works and the hyphen inserts correctly. So it seems to be struggling to deal with the CSV: insert into table_name SET Name = 'L154 – TRAINING WARRANTY' – James Wilson Nov 06 '17 at 16:11
  • It worked!!!! Can't believe it. I can live with re-saving the imports each time. Thank you so much. Can you add that as a new answer and I will vote? – James Wilson Nov 06 '17 at 16:19
  • Just running some tests - it seems I need to export the data as XLSX if I want to edit the data back in Excel. If I export as CSV the data looks like this: L154 – TRAINING WARRANTY - is that what you would expect? – James Wilson Nov 06 '17 at 16:24
  • @JamesWilson Yes, that looks like `ANSI` to me, you can easily convert that to UTF-8 too, via the Npp – mega6382 Nov 06 '17 at 16:25
  • 1
    I exported to to XLSX. Opened in Excel, edit the file, saved as CSV. Opened the CSV in NotePad++, converted to UTF-8 - imported successfully to DB. I'm happy with that procedure! – James Wilson Nov 06 '17 at 16:45