2

I am trying to use the oracle regular expression code in snowflake, but getting null value as result where as it is working as expected in snowflake.

Requirement: Compare post code and return result if it matches the format.

SELECT  
   CASE WHEN REGEXP_LIKE('AB101TZ','^[A-Z]{2}[0-9]+') THEN 'AB101TZ'
         WHEN REGEXP_LIKE('AB101TZ','^[A-Z][0-9]+') THEN 'AB101TZ'
         ELSE '-'
     END postcode_part_a

What modifications need to do this expression

Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45

3 Answers3

3

So as Greg noted that REGEX functions automatically anchor, thus the ^ & $ tokens are not "needed" but if you want an open tail then you need to add a .*.

SELECT column1
    ,REGEXP_LIKE(column1,'[A-Z]{2}[0-9]+') as A
    ,REGEXP_LIKE(column1,'[A-Z][0-9]+') as B
    ,REGEXP_LIKE(column1,'[A-Z]{2}[0-9]{1,3}[A-Z]{2}') as C
    ,REGEXP_LIKE(column1,'[A-Z]{1,2}[0-9]+.*') as D
FROM VALUES ('AB101TZ'), ('_AB101TZ'), ('AA0000'), ('A00000');

thus this gives:

COLUMN1   A       B       C       D
AB101TZ   FALSE   FALSE   TRUE    TRUE
_AB101TZ  FALSE   FALSE   FALSE   FALSE
AA0000    TRUE    FALSE   FALSE   TRUE
A00000    FALSE   TRUE    FALSE   TRUE

so A & B are your to matches, but with the '$' removed to show what they do match as is. And thus why your input in not matching because TZ is not a number

C is Greg's solution. Which will not what your second filter ^[A-Z][0-9]+' would match, thus I made D which allows 1 or 2 characters, then some numbers, then anything. Anyways it should be possible to see how to mix and match those matching parts to match the data you have, in the format that is correct for you.

Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45
1

REGEXP_LIKE has to match the whole string, not just a part of it, in order to return true.

You'd probably want something like this (in Snowflake REGEXP are automatically anchored):

[A-Z]{2}[0-9]{1,3}[A-Z]{2}

This will match the pattern 2 uppercase letters, followed by between 1 and 3 numeric digits, followed by 2 uppercase letters. You can adjust as appropriate for the postal codes in the nation(s) you're coding.

Greg Pavlik
  • 10,089
  • 2
  • 12
  • 29
0

Also as a workaround for the REGEXP_LIKE you can use REGEXP_SUBTR(...) IS NOT NULL this will work because the REGEXP_SUBSTR does not apply automatic anchoring.

orellabac
  • 2,077
  • 2
  • 26
  • 34
  • can you elaborate on that? I have a case where REGEXP_SUBSTR returns the string I want but REGEXP_LIKE returns false and I stumbled upon this thread. – Moseleyi Sep 06 '22 at 08:09
  • 1
    Sure. If you read [here](https://docs.snowflake.com/en/sql-reference/functions/regexp_like.html#usage-notes) "The function implicitly anchors a pattern at both ends (i.e. '' automatically becomes '^$', and 'ABC' automatically becomes '^ABC$'). To match any string starting with ABC, the pattern would be 'ABC.*'. " – orellabac Sep 06 '22 at 19:05
  • 1
    That is why is you need to use those anchors the REGEXP_SUBSTR will help in that situation. @Moseleyi does that clarifies your question. – orellabac Sep 06 '22 at 19:07