2

I have an SQL function for converting a string to a base64-encoded string. The only problem is is that UTF8 is not being used, which I can tell because I have a seperate C# tool that gives a different output.

SQL Function:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER FUNCTION [dbo].[Base64Encode] ( @sInput NVARCHAR(max) )
RETURNS NVARCHAR(max)
BEGIN

DECLARE @vInput              VARBINARY(max)
DECLARE @sEncodedOutput      NVARCHAR(max)

set @vInput = convert(varbinary(max), @sInput)
set @sEncodedOutput = cast('' as xml).value('xs:base64Binary(sql:variable("@vInput"))', 'NVARCHAR(max)')

RETURN @sEncodedOutput

END

C#

        try
        {
            encodedValueTextbox.Text = Convert.ToBase64String(
                Encoding.UTF8.GetBytes(valueTextbox.Text));
        }
        catch (Exception ex)
        {
            encodedValueTextbox.Text = ex.Message;
        }

Is there a way to get SQL to use UTF8?

LynchDev
  • 793
  • 1
  • 12
  • 27
  • 2
    While I can't anwer your question, I am wondering why you want to base64 encode a UFT-8 string? Do you try to hide the text? If you want so, better go with encryption. – GameScripting Jan 14 '13 at 13:55
  • The 'random text' look that base64 gives is sufficient – LynchDev Jan 14 '13 at 13:59
  • @GameScripting: You can't base-64 encode an nvarchar value as there are more than 256 character codes, so the text has to be encoded into byte format, for example using UTF-8. – Guffa Jan 14 '13 at 14:00
  • A MSSQL `NVarChar` is not a .Net `string`. A MSSQL `VarBinary` is not a .Net `byte[]`. A TSQL conversion to from `NVarChar` to `VarBinary` is not a call to the .Net `UTF8.GetBytes()` function. A TSQL cast to an XML `NVarChar` is not a call to the .Net `Convert.ToBase64String()`. So, what you makes you think the operations should be equivalent? Do you want to round trip the values using functions from both platforms? – Jodrell Jan 14 '13 at 14:04
  • 1
    @Jodrell: I am not sure what your point is? Yes I understand the datatypes differ between languages, but as I am not as proficient in SQL as I am C#, I am asking how can I get SQL to produce the same output as C#. – LynchDev Jan 14 '13 at 14:15
  • I think if what I am reading is correct; SQL Server doesn't support UTF8? I was hoping to be able to do something like `convert(varbinary(max), @sInput, UTF8)` or something... – LynchDev Jan 14 '13 at 14:30
  • LynchDev, NVARCHAR is UTF16. You can do this but you need to write your own UTF-8 encoding `create function dbo.NVarCharToUTF8(@s nvarchar(max)) returns varbinary(max) as....` – Ben Jan 14 '13 at 14:48
  • My point is. If the TSQL approach round trips, does it matter if the results differ from some other method? – Jodrell Jan 14 '13 at 15:24
  • @Jodrell has a point. Why not change the C# method to use UTF-16 to match the T-SQL method? Probably easier. – Ben Jan 14 '13 at 15:57
  • Still not understanding you @Jodrell; if you're asking if it matters that both methods produce the same output, then yes it does (thus the question!). @Ben I was hoping to not have to do that as it will require taking down the system while we spend time re-encoding our data. As for a NVarCharToUTF8 function - I wouldn't know where to start. – LynchDev Jan 14 '13 at 16:17
  • You can do it with a SQL CLR function. http://msdn.microsoft.com/en-us/library/ms189876.aspx – Jodrell Jan 14 '13 at 16:37
  • See this answer for an example of converting a string to UTF-8: https://stackoverflow.com/a/14041069/1187211 – Razvan Socol Dec 25 '17 at 18:12

1 Answers1

0

Your @sInput is NVARCHAR, which is UNICODE UCS-2. So, you're turning UCS-2 into binary.

You could change the input parameter to varbinary and pass in UTF-8 as binary into the function. If your source data into the function is also nvarchar from SQL, you'll need to keep pushing that binary data conversion back to the source C# (maybe to the point of storage as varbinary, basically I don't know what the rest of your system looks like).

An alternative is to write this function in CLR. Convert the UCS-2 into UTF-8 using C# and then just do the Base64 encoding there as well.

WebMasterP
  • 518
  • 4
  • 8