0

Hi i am new to SQL and am writing a case statement for a column of grade values. The values can be a length of 3 like A02, B04, A10, A09, D03. The first character is a letter while the next 2 are digits.

If a user enters in 'A02 I want to change it to do A02. Basically remove any special characters if there are present.

CASE
WHEN Grade like '[^0-9A-z]%' THEN ''
else Grade end as Grade

So far I have this but I am not sure how to use regex to remove the character only search for it.

jose_bacoy
  • 12,227
  • 1
  • 20
  • 38
  • 2
    Whatever is sending commands to the database should remove the special characters, not the database itself. The majority of languages (and/or libraries) that support SQL have built-in methods of doing this anyway. This would be unnecessarily complicated to do in the query itself. – Jesse Aug 17 '22 at 18:16
  • 1
    Which dbms are you using? ANSI SQL like has no regexp functionality. – jarlh Aug 17 '22 at 18:46
  • Delete non-alpha (not a letter and not a digit) characters if they appear in the first position only? – tinazmu Aug 17 '22 at 21:40

1 Answers1

0

Unless you really want to do a CASE for the fun of it, in oracle I'd do it like this which removes punctuation characters and spaces when you select it. Note this does not verify format so a grade of Z1234 would get returned.

WITH tbl(ID, grade) AS (
  SELECT 1, 'A01' FROM dual UNION ALL
  SELECT 1, '''B02' FROM dual UNION ALL
  SELECT 2, '$    C01&' FROM dual
)
SELECT ID, grade, REGEXP_REPLACE(grade, '([[:punct:]]| )') AS grade_scrubbed
from tbl;


        ID GRADE     GRADE_SCRUBBED
---------- --------- --------------
         1 A01       A01           
         1 'B02      B02           
         2 $    C01& C01           

3 rows selected.

HOWEVER, that said, since you seem to want to verify the format and use regex, you could do it this way although it's a little fugly. See comments.

WITH tbl(ID, grade) AS (
  -- Test data. Include every crazy combo you'd never expect to see,
  --   because you WILL see it, it's just a matter of time :-)
  SELECT 1, 'A01'               FROM dual UNION ALL
  SELECT 1, '''B02'             FROM dual UNION ALL
  SELECT 2, '$    C01&'         FROM dual UNION ALL
  SELECT 3, 'DDD'               FROM dual UNION ALL
  SELECT 4, 'A'||CHR(10)||'DEF' FROM dual UNION ALL
  SELECT 5, 'Z1234'             FROM dual UNION ALL
  SELECT 6, NULL                FROM dual
)
SELECT ID, grade, 
CASE 
  -- Correct format of A99.
  WHEN REGEXP_LIKE(grade, '^[A-Z]\d{2}$') 
    THEN grade
  -- if not A99, see if stripping out punctuation and spaces make it match A99.
  --   If so, return with punctuation and spaces stripped out.
  WHEN NOT REGEXP_LIKE(grade, '^[A-Z]\d{2}$') 
    AND REGEXP_LIKE(REGEXP_REPLACE(grade, '([[:punct:]]| )'), '^[A-Z]\d{2}$')
    THEN REGEXP_REPLACE(grade, '([[:punct:]]| )')
  -- if not A99, and stripping out punctuation and spaces didn't make it match A99,
  --   then the grade is in the wrong format.
  WHEN NOT REGEXP_LIKE(grade, '^[A-Z]\d{2}$') 
    AND NOT REGEXP_LIKE(REGEXP_REPLACE(grade, '([[:punct:]]| )'), '^[A-Z]\d{2}$')
    THEN 'Invalid grade format'
  -- Something fell through all cases we tested for.  Always expect the unexpected!
  ELSE 'No case matched!'
END AS grade_scrubbed
from tbl;


        ID GRADE                GRADE_SCRUBBED      
---------- -------------------- --------------------
         1 A01                  A01                 
         1 'B02                 B02                 
         2 $    C01&            C01                 
         3 DDD                  Invalid grade format
         4 A
DEF                            Invalid grade format
                                                    
                                                                                
         5 Z1234                Invalid grade format
         6                      No case matched!    

7 rows selected.
Gary_W
  • 9,933
  • 1
  • 22
  • 40