As the pre-defined punctuation character class does not work for the end of the string, a self defined character class is used instead. Leaving out the dot on purpose. Adding in the single quote separately (as escaping it didn't work and finding the right character for the q operator might be hard to do in this case). Adding the closing square bracket on its own as Oracle doesn't seem to handle it correctly when escaped. Finally adding the trailing consecutive dots explicitly:
WITH T (id, col) AS (
SELECT 1, 'GEORGE & SON ' FROM DUAL UNION ALL
SELECT 2, '-GEORGE & SON' FROM DUAL UNION ALL
SELECT 3, '&GEORGE & SON' FROM DUAL UNION ALL
SELECT 4, 'GEORGE & SON..' FROM DUAL UNION ALL
SELECT 5, 'GEORGE & SON.' FROM DUAL UNION ALL
SELECT 6, '-GEORGE & SON S.A.' FROM DUAL UNION ALL
SELECT 7, 'GEORGE & SON!' FROM DUAL UNION ALL
SELECT 8, 'GEORGE & SON"' FROM DUAL UNION ALL
SELECT 9, 'GEORGE & SON#' FROM DUAL UNION ALL
SELECT 10, 'GEORGE & SON$' FROM DUAL UNION ALL
SELECT 11, 'GEORGE & SON%' FROM DUAL UNION ALL
SELECT 12, 'GEORGE & SON&' FROM DUAL UNION ALL
SELECT 13, 'GEORGE & SON(' FROM DUAL UNION ALL
SELECT 14, 'GEORGE & SON)' FROM DUAL UNION ALL
SELECT 15, 'GEORGE & SON*' FROM DUAL UNION ALL
SELECT 16, 'GEORGE & SON+' FROM DUAL UNION ALL
SELECT 17, 'GEORGE & SON,' FROM DUAL UNION ALL
SELECT 18, 'GEORGE & SON\' FROM DUAL UNION ALL
SELECT 19, 'GEORGE & SON-' FROM DUAL UNION ALL
SELECT 20, 'GEORGE & SON\' FROM DUAL UNION ALL
SELECT 21, 'GEORGE & SON/' FROM DUAL UNION ALL
SELECT 22, 'GEORGE & SON:' FROM DUAL UNION ALL
SELECT 23, 'GEORGE & SON;' FROM DUAL UNION ALL
SELECT 24, 'GEORGE & SON<' FROM DUAL UNION ALL
SELECT 25, 'GEORGE & SON=' FROM DUAL UNION ALL
SELECT 26, 'GEORGE & SON>' FROM DUAL UNION ALL
SELECT 27, 'GEORGE & SON?' FROM DUAL UNION ALL
SELECT 28, 'GEORGE & SON@' FROM DUAL UNION ALL
SELECT 29, 'GEORGE & SON[' FROM DUAL UNION ALL
SELECT 30, 'GEORGE & SON^' FROM DUAL UNION ALL
SELECT 31, 'GEORGE & SON_' FROM DUAL UNION ALL
SELECT 32, 'GEORGE & SON`' FROM DUAL UNION ALL
SELECT 33, 'GEORGE & SON{' FROM DUAL UNION ALL
SELECT 34, 'GEORGE & SON|' FROM DUAL UNION ALL
SELECT 35, 'GEORGE & SON}' FROM DUAL UNION ALL
SELECT 36, 'GEORGE & SON~' FROM DUAL UNION ALL
SELECT 37, 'GEORGE & SON''' FROM DUAL UNION ALL
SELECT 38, 'GEORGE & SON]' FROM DUAL)
SELECT
* FROM T
WHERE REGEXP_LIKE(col, '(^[[:punct:]]|[-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']$)|]$|\.\.$|(^[[:space:]]|[[:space:]]$)')
ORDER BY id
;
Updated Requirements
Punctuation followed by a single dot
Add an optional dot to the set of special characters; going from
'[-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']$'
to
'[-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']\.?$'
as in
WITH T (id, col) AS (
SELECT 40, 'GEORGE & SON^.'FROM DUAL UNION ALL
SELECT 41, 'GEORGE & SON_.'FROM DUAL UNION ALL
SELECT 42, 'GEORGE & SON`.'FROM DUAL UNION ALL
SELECT 43, 'GEORGE & SON{.'FROM DUAL UNION ALL
SELECT 44, 'GEORGE & SON|.'FROM DUAL UNION ALL
SELECT 45, 'GEORGE & SON}.'FROM DUAL UNION ALL
SELECT 46, 'GEORGE & SON~.'FROM DUAL UNION ALL
SELECT 47, 'GEORGE & SON''.'FROM DUAL UNION ALL
SELECT 48, 'GEORGE & SON].'FROM DUAL)
SELECT
* FROM T
WHERE REGEXP_LIKE(col, '([-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']\.?$)|]\.?$')
ORDER BY id
;
Repetition of (combinations of) blanks and special characters within the string
Originally, just leading and trailing occurrences were asked for… ;-)
Sequences of two or more space/punctuation characters are caught by
[[:space:][:punct:]]{2,}
If you want this explicitely inside strings, only - surround them with word characters:
\w[[:space:][:punct:]]{2,}\w
Leading/trailing consecutive spaces are already matched when are single one is found - no need for worry about them explicitely.
Which gives:
WITH T (id, col) AS (
SELECT 50, 'GEORGE & SON ' FROM DUAL UNION ALL
SELECT 51, 'GEORGE & SON ' FROM DUAL UNION ALL
SELECT 52, ' GEORGE & SON' FROM DUAL UNION ALL
SELECT 53, ' GEORGE & SON' FROM DUAL UNION ALL
SELECT 54, 'GEORGE & SON' FROM DUAL UNION ALL
SELECT 55, 'GEORGE & SON S.A.' FROM DUAL UNION ALL
SELECT 56, 'GEORGE & SON S.A.' FROM DUAL UNION ALL
SELECT 60, ' GEORGE and SON' FROM DUAL UNION ALL
SELECT 61, ' ,GEORGE and SON' FROM DUAL UNION ALL
SELECT 62, ', GEORGE and SON' FROM DUAL UNION ALL
SELECT 63, 'GEORGE -- SON' FROM DUAL UNION ALL
SELECT 64, 'GEORGE --SON' FROM DUAL UNION ALL
SELECT 65, 'GEORGE & SON' FROM DUAL UNION ALL
SELECT 66, 'GEORGE + SON' FROM DUAL UNION ALL
SELECT 67, 'GEORGE and , SON' FROM DUAL UNION ALL
SELECT 68, 'GEORGE and , SON' FROM DUAL UNION ALL
SELECT 69, 'GEORGE and SON ,' FROM DUAL UNION ALL
SELECT 70, 'GEORGE and SON. ' FROM DUAL UNION ALL
SELECT 71, 'GEORGE and+-SON' FROM DUAL)
SELECT
* FROM T
-- WHERE REGEXP_LIKE(col, '(^[[:punct:]]|[-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']\.?$)|]$|\.\.$|(^[[:space:]]|[[:space:]]$)|[[:space:][:punct:]]{2,}')
WHERE REGEXP_LIKE(col, '(^[[:punct:]]|[-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']\.?$)|]$|\.\.$|(^[[:space:]]|[[:space:]]$)|\w[[:space:][:punct:]]{2,}\w')
ORDER BY id
;
But this will produce false positives, most prominently GEORGE & SON. This could, to some degree, be avoided by replacing [:punct:] with a less inclusive set. The (final) choice will depend on whether false negatives or false positives are more of a concern.
See it in action:
Catch arbitrary sequences of punctuation and space characters - but allow for single letters followed by a single dot and a single blank
As mentioned before, false positives need to balanced against false negatives. One way or the other.
However, this is probably a good time to consider breaking apart the overall problem into smaller ones - and handle them individually. Even if GEORGE and P. SON is perfectly acceptable, you probably want to review, e.g., -GEORGE and P. SON. So let's focus on sequences of stray characters in the middle of the string - even remembering the ** & ** from before, and allowing enumerations (and thus commas):
WHERE
REGEXP_LIKE(col, '\w[[:space:][:punct:]]{2,}\w')
AND
NOT REGEXP_LIKE(col, ' [[:upper:]]\. \w')
AND
NOT INSTR(col, ', ') > 0
AND
NOT INSTR(col, ' & ') > 0
Probably followed by
WHERE
REGEXP_LIKE(col, '\w[[:space:][:punct:]]{2,}\w')
AND
(REGEXP_LIKE(col, ' [[:upper:]]\. \w')
OR
INSTR(col, ', ') > 0
OR
INSTR(col, ' & ') > 0
)
in order to find, e.g., GEORGE and , SON between many valid ones. INSTR
could be faster than REGEX - depending on overall circumstances…
A few more words on the mechanics
(i) [[:punct:][:space:]] essentially combines [[:punct:]] and [[:space:]] into a single character class. As far as selection from that class is concerned, the order is of no relevance.
(ii)
[-!"#$%&()*+,\/:;<=>?@[^_`{|}~' || '''' || ']
is
[-!"#$%&()*+,\/:;<=>?@[^_`{|}~]
with a single quote added. If this were tried directly, Oracle would consider the single quote to end the parameter value. And escaping the single quote with a backslash didn't work… So basically, this is what has been called above "Adding in the single quote separately".
Please comment, if and as this requires adjustment / further detail.