I am migrating a project from PHP to .NET, and I have run into a problem with the type of encoding used in the PHP code. The database contains values like

Thüringen, Baden-Württemberg, Oberösterreich

which are encoded in both the JSON and XML output as

Th\u00fcringen, Baden-W\u00fcrttemberg, Ober\u00f6sterreich

respectively. How can I do the same in .NET? What type of encoding does the PHP code use?

Sebastian Brosch
Sanjeev S

1 Answer

Your data is stored as UTF-8. You can tell from the letter ü (Unicode code point \u00fc), whose UTF-8 representation is the two bytes 0xC3 0xBC.

If you read 0xC3 0xBC as ISO-8859-1, you get the garbage Ã¼ in place of the ü in your first word, Thüringen. That proves the data is UTF-8 (and that whatever is reading your database incorrectly assumes it is ISO-8859-1).
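
To make the byte-level argument concrete, here is a minimal Python sketch (Python is used only for illustration, it is not part of the PHP or .NET code, and the variable names are made up). It encodes the word as UTF-8 and then deliberately decodes the same bytes with the wrong charset:

```python
# Illustration only: variable names are hypothetical.
word = "Thüringen"

# Encode the single letter and the whole word as UTF-8.
u_umlaut_bytes = "ü".encode("utf-8")   # the two bytes 0xC3 0xBC
word_bytes = word.encode("utf-8")

# Decode the UTF-8 bytes with the wrong charset (ISO-8859-1).
# Each byte becomes its own character: 0xC3 -> Ã, 0xBC -> ¼.
misread = word_bytes.decode("iso-8859-1")

print(u_umlaut_bytes.hex(" "))         # c3 bc
```

The bytes themselves never change; only the charset used to interpret them does.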

In your JSON/XML output, PHP has escaped the non-ASCII characters as Unicode code points (\u00fc for ü, \u00f6 for ö), but I think this is irrelevant to your problem.
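
As a rough illustration of why the escaping is harmless, here is a short Python sketch. Python's json module stands in for the PHP output step here (an assumption for the sake of the example, since it also escapes non-ASCII characters by default). Any JSON parser turns the \uXXXX escapes back into the original letters:

```python
import json

# Stand-in for the PHP output step; Python's json module escapes
# non-ASCII characters by default (ensure_ascii=True).
regions = ["Thüringen", "Baden-Württemberg", "Oberösterreich"]
escaped = json.dumps(regions)

# Parsing the JSON converts the \uXXXX escapes back into the letters.
decoded = json.loads(escaped)

print(escaped)  # ["Th\u00fcringen", "Baden-W\u00fcrttemberg", "Ober\u00f6sterreich"]
```

The escaped and unescaped forms describe the same string, so a consumer of the JSON sees identical data either way.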

Assuming you are having trouble reading from the database with .NET, make sure the encoding used for the database connection is UTF-8, and it should read your data without any issue.
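
If some values already come back garbled, one way to check whether they are UTF-8 bytes that were misread as ISO-8859-1 is to reverse that step. The Python sketch below is only a heuristic, and `undo_latin1_misread` is a hypothetical helper name, not a library function:

```python
def undo_latin1_misread(value: str) -> str:
    """Return the repaired text if `value` looks like UTF-8 bytes that
    were decoded as ISO-8859-1; otherwise return `value` unchanged."""
    try:
        # Recover the raw bytes, then decode them as UTF-8.
        return value.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in ISO-8859-1, or the bytes are not valid
        # UTF-8, so the value was most likely read correctly.
        return value
```

A correctly read word such as Thüringen comes back unchanged, because the byte 0xFC followed by an ASCII letter is not valid UTF-8. This is useful for spotting bad rows, but the real fix is still to use the right encoding when reading or converting the data.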

Martin Konecny
  • The PHP application uses MySQL. For the migration I converted the entire database from MySQL to MSSQL with the same structure. – Sanjeev S Dec 31 '15 at 05:51
  • Ok, you may have converted your database using the wrong encoding. Check which encoding you are using with http://stackoverflow.com/questions/7321159/determining-the-character-set-of-a-table-database and make sure your target MSSQL tables are in UTF-8! – Martin Konecny Dec 31 '15 at 06:37