Well, I guess the comments below your question have already gone in detail into why the way the data is stored in is a bad idea and in what ways what you attempt is likely to fail if your assumptions about the addresses are not 100% accurate.
That being said, you might be in the process of trying to remedy that and if you are sure that the rule you defined always fits, you can do what you want.
To achieve that, use the REVERSE()
function twice to find the last occurrence of a substring. This way, you can separate everything starting at the last comma. Use the TRIM()
function on that to remove surrounding whitespace that may occur, e.g. if there are spaces directly following the comma.
If you then are sure there are only numeric characters and spaces in the postcode and the postcode follows immediately, you can use PATINDEX
together with LEFT()
and RIGHT()
to find the first non-numeric character, similar to this SO post.
Full solution: For the input data
address |
Marco Polo street 8a, 44000 Vienna |
Marco Polo street 8a, 44000 Vienna Old Town |
Marco Polo street 8a,44000 Vienna |
Marco Polo street 8a, Marco Polo street 8b, 44 000 Vienna Old Town |
Marco Polo street 8a, 44 00 0 Vienna |
the statement
with tmp as (
select TRIM(REVERSE(SUBSTRING(REVERSE(address),0,CHARINDEX(',',REVERSE(address))))) lastpart
from addresstable
)
select LEFT(lastpart, PATINDEX('%[^0-9 ]%', lastpart) -1) as postcode,
RIGHT(lastpart, len(lastpart) - PATINDEX('%[^0-9 ]%', lastpart) + 1) as city
from tmp
produces the result
postcode |
city |
44000 |
Vienna |
44000 |
Vienna Old Town |
44000 |
Vienna |
44 000 |
Vienna Old Town |
44 00 0 |
Vienna |
Notes:
- if you don't want to allow spaces, remove them from the expression in
PATINDEX()
, i.e. '%[^0-9]%'
- if you want your
postcode
to be of type int
, use TRY_CONVERT()
together with REPLACE()
and NULLIF()
. See this SO answer and dbfiddle below. Be advised that this removes possible leading zeroes that may belong to the postcode.
Alternatively, if you are sure that whatever string follows after the last comma until the next whitespace positively is the postcode
and you want to allow alphabetic characters in the postcode, your attempt from before already does it. This will however break if your postcode has spaces.
with tmp as (
select TRIM(REVERSE(SUBSTRING(REVERSE(address),0,CHARINDEX(',',REVERSE(address))))) lastpart
from addresstable
)
select SUBSTRING(lastpart,0,CHARINDEX(' ',lastpart,0)+1) as postcode,
SUBSTRING(lastpart,CHARINDEX(' ',lastpart,0)+1,LEN(lastpart)) as city
from tmp
See this db<>fiddle for a comparison of the two, including a few examples of addresses that will not work with one of the two.