I'm trying to remove/replace certain unicode symbol characters from text in SQL Server 2019.
The server/DB I'm working with has collation Latin1_General_CI_AS.
I can't change the collation of the server/DB so i tried this (and most of the time it works, but some symbols don't work).
Environment:
SQL Server version:
Microsoft SQL Server 2019 (RTM-GDR) (KB4583458) - 15.0.2080.9 (X64) Nov 6 2020 16:50:01 Copyright (C) 2019 Microsoft Corporation Developer Edition (64-bit) on Windows 10 Pro 10.0 (Build 19041: )
Server Collation: Latin1_General_CI_AS
Database Collation: Latin1_General_CI_AS
This example works as expected:
select replace(N'☋a' collate Latin1_General_100_CI_AI_SC, N'☋' collate Latin1_General_100_CI_AI_SC, N'XXX')
select replace(N'a☋' collate Latin1_General_100_CI_AI_SC, N'☋' collate Latin1_General_100_CI_AI_SC, N'XXX')
Output:
XXXa
aXXX
This example only works if the unicode symbol (⚶) is the first character, but not if another character stands before it:
select replace(N'⚶a' collate Latin1_General_100_CI_AI_SC, N'⚶' collate Latin1_General_100_CI_AI_SC, N'XXX')
select replace(N'a⚶' collate Latin1_General_100_CI_AI_SC, N'⚶' collate Latin1_General_100_CI_AI_SC, N'XXX')
Output:
XXXa (correct)
a⚶ (wrong)
Does anybody have an idea why the replacement works with some characters/character orders, and with some it doesn't?