1

I'm runing query like this:

SELECT CAST(CAST(IDAT as varbinary(max)) as varchar(max)) 

IDAT column type is the "image", query executing successful but some Turkish characters are 'çaÄŸdaÅŸ' like this. it should be 'çağdaş'. I tried this :

SELECT CAST(CAST(IDAT as varbinary(max)) as varchar(max)) COLLATE Turkish_CI_AS

Nothing changed. Does anyone have an idea for this situation?

alexkidd
  • 68
  • 1
  • 9
  • 'çaÄŸdaÅŸ' is 9 symbols, while 'çağdaş' is only 6. What is the size of the data in IDAT? Post is as hex value. Do you know the encoding used for the data stored there? – Andrey Nikolov Nov 30 '18 at 07:12
  • Value before casting varchar : 0x68C3BC736579696E20616C692076656C69206E6F746C617220C59F65726D696E2061C49F72C4B120C3B66BC3BC7A20C3A761C49F6461C59F i dont know encoding using for this data. – alexkidd Nov 30 '18 at 07:35
  • SQL Server already supports Unicode. *Don't* store text as binary, you don't need it – Panagiotis Kanavos Nov 30 '18 at 07:51
  • The Unicode types are `nchar` and `nvarchar`. Don't use `varchar` unless you can *ensure* the text matches the collation. – Panagiotis Kanavos Nov 30 '18 at 07:58

1 Answers1

4

This binary value 0x68C3BC736579696E20616C692076656C69206E6F746C617220C59F65726D696E2061C49F72C4B120C3B66BC3BC7A20C3A761C49F6461C59F is UTF-8 encoded, which converted to text is hüseyin ali veli notlar şermin ağrı öküz çağdaş. Unfortunately, SQL Server doesn't support UTF-8 until 2019, so you must decode it yourself. This question explains how to do it.

If you can change the application and store the value in UTF-16 you will be able to cast it to NVARCHAR directly (or just store the data as text).

Andrey Nikolov
  • 12,967
  • 3
  • 20
  • 32
  • 1
    The answer to that question is really bad. A simple SQLCLR UDF that with just a single call `Encoding.UTF8.GetString(inputBytes.Value)` would be faster and far more likely to be correct. [This question](https://stackoverflow.com/questions/840552/clr-udf-returning-varbinarymax) shows the reverse, generating UTF8 bytes from a string. – Panagiotis Kanavos Nov 30 '18 at 07:55
  • Yes, if CLR is an option, it will be much better option. But still, changes in the application would be better. If this text must be available as UTF-8 in the database (why?!) and converted to text in the database (again, why?), then the application could simply store it twice :) – Andrey Nikolov Nov 30 '18 at 08:05
  • Actually, My only interest is converting this data type to the right string. in this case, @AndreyNikolov solution is successful worked. – alexkidd Nov 30 '18 at 08:15