0

Need a query that finds rows with a column string value that contains any character other than alphanumeric + specific character.

For example:

colName
_______
abc123"
acb231!
bca412++
.4!21
abc
123

I want to select all rows that contain any characters other than

  • digits 0-9
  • letters a-z (lower or upper case),
  • space
  • backtick
  • Any of the following additional characters !$%[_-()

Resulting in:

abc123"
.4!21
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
fwippy
  • 33
  • 8
  • What have you tried so far – Peter Smith Mar 16 '23 at 19:43
  • WHERE colName NOT LIKE '%[^0-9A-Z!+]%' Lots variants of it too. I'm fairly new to using regex, so I've been playing around with it for a while. – fwippy Mar 16 '23 at 19:46
  • 1
    `NOT LIKE` should be `LIKE` as you are looking for rows that match the pattern https://dbfiddle.uk/LonR70YZ – Martin Smith Mar 16 '23 at 19:47
  • Does this answer your question? [SQL query for finding rows with special characters only](https://stackoverflow.com/questions/2968042/sql-query-for-finding-rows-with-special-characters-only) – PM 77-1 Mar 16 '23 at 19:49
  • Hm, I think the actual issue is that characters I'm inversely selecting for need to be escaped maybe? I also want to include the characters !$%`-_ to be okay. So, a string containing * would still be selected (e.g., *123!) – fwippy Mar 16 '23 at 19:51
  • 1
    No they don't need to be escaped - the main issue is that you are using `NOT LIKE` and should be using `LIKE`. A second issue is that `-` is one of the additional characters and so make sure this is the end character of the set so it isn't interpreted as meaning a range – Martin Smith Mar 16 '23 at 19:54
  • Sorry, I should've clarified: yes, I tried it with LIKE instead and that worked for the alphanumeric characters. Can you clarify on what you mean by making sure "-" is the end character of the set? As in place it literally at the end of the expression? – fwippy Mar 16 '23 at 19:59
  • Yes. If it is surrounded by other characters SQL Server interprets that as you having a mixed set of specific characters and a range of characters – Martin Smith Mar 16 '23 at 20:01
  • I see, thank you! If I included square brackets as whitelisted characters as well, would I have to escape them then? Looking like this so far: LIKE '%[^0-9a-zA-z \[!$%`_()-]%' ESCAPE '\' – fwippy Mar 16 '23 at 20:08
  • 1
    Not sure. Easiest thing to do is first get the positive version of your expression working. Like this https://dbfiddle.uk/EXuqNSXd and then when that is returning exactly the characters you need then add the `^` onto it to get the negated version – Martin Smith Mar 16 '23 at 20:16
  • 1
    you had the casing wrong in your previous comment (`A-z` needs to be `A-Z`) https://dbfiddle.uk/SE9gg9ET – Martin Smith Mar 16 '23 at 20:24

2 Answers2

1

For the full requirement described in the comments this was trickier than I thought it would be

but

LIKE '%[^- !$%`_()\]0-9a-zA-Z]%' COLLATE Latin1_General_100_BIN2 ESCAPE '\' 

does the job.

The ] is escaped with \ and the - is put at the beginning of the set so it is not interpreted as being in the role of a range (as later in the pattern)

To validate this is correct I find it simplest to check the positive version against all characters first (DB Fiddle).

SELECT CharCode = value, 
       Character = NCHAR(value)
FROM GENERATE_SERIES(0,65535)
WHERE NCHAR(value) LIKE '%[- !$%`_()\]0-9a-zA-Z]%' COLLATE Latin1_General_100_BIN2 ESCAPE '\' 
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
0

In SQL Server, you can use the like operator to match the pattern you are looking for.

The pattern is [^a-z0-9!+]

Here is the query:

select
   colName 
from
   input_table
where
   colName like '%[^a-z0-9!+]%'
colName
abc123"
.4!21
Dale K
  • 25,246
  • 15
  • 42
  • 71
Junjie
  • 1,170
  • 1
  • 2
  • 6