I am trying to figure out how to use PATINDEX to find a range of letter characters while excluding accented characters. A straight comparison works just fine with the default (insensitive) collation, and an accent-sensitive collation distinguishes the two characters as expected. However, when I search for a range of letters, the range matches the accented character under both collations:

SELECT
    IIF('Ú' = 'U' COLLATE Latin1_General_CI_AI, 'Match', 'No') AS MatchInsensitive,
    IIF('Ú' = 'U' COLLATE Latin1_General_CI_AS, 'Match', 'No') AS MatchSensitive,
    PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AI)      AS PIInsensitive,
    PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AS)      AS PISensitive

This gives the following results:

MatchInsensitive MatchSensitive PIInsensitive PISensitive
---------------- -------------- ------------- -----------
Match            No             1             1

What I am really trying to do is identify the character position of accented characters in a string, so the pattern I actually want to search with is PATINDEX('%[^A-Z0-9 ]%').

For the following query I would expect a result of 2, but I get 0:

SELECT PATINDEX('%[^A-Z0-9 ]%', 'médico')

RPh_Coder

1 Answer

You could use a binary collation, e.g. Latin1_General_100_BIN2.

select patindex('%[^a-zA-Z0-9 ]%', 'médico' collate Latin1_General_100_BIN2)

rextester: http://rextester.com/ZICLN98474

This returns 2.
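As for why accent sensitivity alone does not help: as far as I understand it, a range such as [A-Z] in a PATINDEX or LIKE pattern is evaluated using the collation's sort order. In the Latin1_General collations, Ú sorts between U and V, so it falls inside A-Z whether or not the collation is accent-sensitive. A binary collation compares by code point instead, and Ú (U+00DA) lies outside A-Z (U+0041 to U+005A). Binary collations are also case-sensitive, which is why the pattern lists both a-z and A-Z. A minimal sketch of the difference, assuming SQL Server (the column aliases and the N'' literals are my own choices for illustration):

```sql
SELECT
    -- Range follows the collation's sort order: Ú sorts between U and V, so it matches.
    PATINDEX('%[A-Z]%', N'Ú' COLLATE Latin1_General_CI_AS)                 AS RangeAccentSensitive,
    -- Range follows code points: Ú (U+00DA) is outside A-Z, so nothing matches.
    PATINDEX('%[A-Z]%', N'Ú' COLLATE Latin1_General_100_BIN2)              AS RangeBinary,
    -- Position of the first character that is not an unaccented letter, digit, or space.
    PATINDEX('%[^a-zA-Z0-9 ]%', N'médico' COLLATE Latin1_General_100_BIN2) AS FirstAccentedPos;
```

The three columns should come back as 1, 0 and 2 respectively.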

SqlZim
  • Thank you. I cannot believe that I didn't think of that. I was just so surprised that accent sensitivity didn't work. – RPh_Coder Mar 30 '17 at 23:16