
We have a SQL Server instance running on Azure RDS. A stored procedure doing ETL work that parses JSON values is misbehaving, and I ultimately tracked the problem down to this minimal reproducible scenario:

```sql
CREATE TABLE [dbo].[conditions_test] (
    [condition_notation] nvarchar(max),
    [condition_id] nvarchar(max)
);

INSERT INTO conditions_test (condition_id, condition_notation) VALUES ('25271', '≤');
```
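
Inspecting the stored value confirms the corruption (a quick check; `UNICODE` returns the code point of a string's first character):

```sql
-- Inspect what was actually stored: UNICODE() returns the code point of
-- the first character. 8804 would be '≤' (U+2264); 61 is '=' (U+003D).
SELECT condition_notation,
       UNICODE(condition_notation) AS code_point
FROM conditions_test;
```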

The problem is that this character ('≤') gets converted into '=', and querying the table returns:

```
condition_notation  condition_id
=                   25271
```

I've tried changing the column collation with `ALTER TABLE ... ALTER COLUMN ... COLLATE Latin1_General_100_CI_AI_SC`, but it made no difference.
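
For concreteness, this is roughly the statement I ran (a sketch against the repro table above; the collation change did not stop the character loss):

```sql
-- Attempted fix: change the column collation. This alters comparison and
-- code-page metadata for the column, but made no difference here because
-- the '≤' was already lost before it reached the column.
ALTER TABLE [dbo].[conditions_test]
    ALTER COLUMN [condition_notation] nvarchar(max) COLLATE Latin1_General_100_CI_AI_SC;
```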

The JSON actually originates in another database running MySQL, where the `CHARSET()` function on the original JSON reports `utf8mb4`; this data is copied over to the MS SQL database for further ETL work.

Why is this happening, and how can I fix it?

Carmageddon
  • The Unicode character will be preserved if you use `values(N'25271', N'≤')` – Stu Apr 18 '23 at 18:41
  • Even [simpler repro](https://dbfiddle.uk/TU__vVKh): `DECLARE @x nvarchar(max) = '≤'; SELECT @x;` – Aaron Bertrand Apr 18 '23 at 19:31
  • Or `SELECT '≤', N'≤';` – Aaron Bertrand Apr 18 '23 at 19:38
  • This isn't anything to do with JSON; it's how you've written your T-SQL code. `'≤'` is a character literal interpreted according to the current database's default collation, whereas `N'≤'` is a National character literal and is treated as a Unicode code point. Whenever you deal with the `nchar` or `nvarchar` data types, or the long-deprecated `ntext` data type, you need to use National character literals with the `N` prefix, or you risk the character loss or corruption you've experienced here. – AlwaysLearning Apr 18 '23 at 21:54
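
A minimal sketch tying the comments together (assuming the repro table above and a default database collation whose code page lacks '≤', such as `SQL_Latin1_General_CP1_CI_AS`):

```sql
-- Side by side: a varchar literal vs. a National (Unicode) literal.
-- Without the N prefix, '≤' is interpreted in the database default
-- collation's code page and can be best-fit mapped to '=' (as observed
-- in the question) before it ever reaches the nvarchar column.
SELECT '≤' AS without_n_prefix, N'≤' AS with_n_prefix;

-- The INSERT from the question, fixed with N-prefixed literals,
-- preserves the character:
INSERT INTO conditions_test (condition_id, condition_notation)
VALUES (N'25271', N'≤');

SELECT condition_notation, condition_id FROM [dbo].[conditions_test];
```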

0 Answers