0

I found a lot of techniques to detect aka special chars($%@#) avail on English keyboard, however I see that some national char like á works differently, though my LIKE condition should get it, as it should select anything but a-z1-9, what is the trick here: In sample below I'm missing my special á. I'm on TSQL 2016 with default settings in US.

    ;WITH cte AS (SELECT 'Euro a€'  St UNION  SELECT 'adgkjb$' St UNION SELECT 'Bravo Endá' St)
    SELECT * FROM cte WHERE St LIKE '%[^a-zA-Z0-9 ]%'

        St
        adgkjb$
        Euro a€
   SELECT CAST(N'€' AS VARBINARY(8))    --0xAC20
   SELECT CAST(N'á' AS VARBINARY(8))    --0xE100
Mike S
  • 296
  • 2
  • 14

1 Answers1

1

SQL Server appears to be helping with ranges of characters due to the default collation. If you explicitly list all of the valid characters it will work as desired. Alternatively, you can force a collation on the pattern match that won't interpret the pattern a containing non-ASCII characters.

-- Explicit pattern for "bad" characters.
declare @InvalidCharactersPattern as VarChar(100) = '%[^abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ]%';

-- Query some sample data with the explicit pattern and an explicitly specified collation.
select Sample,
  case when Sample like @InvalidCharactersPattern then 'Bad Character' else 'Okay' end as ExplicitStatus,
  case when Sample like '%[^a-zA-Z0-9 ]%' collate Latin1_General_100_BIN
    then 'Bad Character' else 'Okay' end as CollationStatus
  from ( values ( 'a' ), ( 'A' ), ( 'á' ), ( 'Foo' ), ( 'F&o' ), ( '$%^&' ) ) as Samples( Sample );

-- Server collation.
select ServerProperty( 'collation' ) as ServerCollation;

-- Available collations.
select name, description
  from sys.fn_helpcollations()
  order by name;
HABO
  • 15,314
  • 5
  • 39
  • 57
  • Tx HABO, so that one missed could be in range? let say in A thru Z? do they go by Hex values on background? – Mike S Mar 29 '18 at 18:41
  • @MikeS The answer has been updated to include information about collations. There are likely many choices for collation that don't contain accented characters. [This](https://learn.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support) is a good starting point. – HABO Mar 29 '18 at 19:39