I was wondering how to find a postcode in an address string and put it in its own variable using regex in PHP.

For example: $address = '123 My Street, My Area, My City, AA11 1AA'

I want $postcode = 'AA11 1AA'

I also want to remove the postcode that's found from the address string.

I have this so far.

$address = strtolower(preg_replace('/[^a-zA-Z0-9_ %\[\]\.\(\)%&-]/s', '', $data[2]));

$postcode = preg_match("/^(([A-PR-UW-Z]{1}[A-IK-Y]?)([0-9]?[A-HJKS-UW]?[ABEHMNPRVWXY]?|[0-9]?[0-9]?))\s?([0-9]{1}[ABD-HJLNP-UW-Z]{2})$/i",$address,$post);
$postcode = $post;
Steve Chambers
Hemm K
  • Can you post your code that you have so far? We will help you improve your regex, but we will not simply throw solutions at you upon request. – Zim84 Aug 14 '13 at 14:37
  • Is the postcode always after a period? – John Dorean Aug 14 '13 at 14:38
  • It may or may not come after a ',', so splitting the string may not be the best option. I just want to be able to read the postcode via a regex, and I believe the one above is right. – Hemm K Aug 14 '13 at 14:44
  • The initial `^` in your regexp means that the postcode will only be detected if it occurs at the beginning of your string: which (with the trailing `$`) means that the postcode must be the entirety of the `$address` string – Mark Baker Aug 14 '13 at 14:45
  • The regexp won't verify perfectly valid UK postcodes like `WN1A 4WW` – Mark Baker Aug 14 '13 at 14:46
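
To illustrate the point about the anchors, here is a minimal sketch. The pattern in `$shape` is a deliberately simplified, hypothetical postcode shape (it is not a full UK postcode validator and not the pattern from the question); only the effect of `^` and `$` matters here.

```php
<?php
// Simplified, illustrative postcode shape: NOT a complete UK postcode validator.
$shape = '[A-Z]{1,2}[0-9][0-9A-Z]?\s?[0-9][A-Z]{2}';

$address = '123 My Street, My Area, My City, AA11 1AA';

// Anchored at both ends: the whole string would have to be a postcode.
$anchored = preg_match('/^' . $shape . '$/i', $address, $m1);

// Anchored at the end only: the postcode may follow any other text.
$endOnly = preg_match('/' . $shape . '$/i', $address, $m2);

var_dump($anchored); // int(0)
var_dump($m2[0]);    // string(8) "AA11 1AA"
```

Dropping `$` as well would let the postcode appear anywhere in the string, at the cost of possibly picking up postcode-like text elsewhere in the address.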

5 Answers

This worked for me:

$value = "Big Ben, Westminster, London, SW1A 0AA, UK";

$pattern = "/((GIR 0AA)|((([A-PR-UWYZ][0-9][0-9]?)|(([A-PR-UWYZ][A-HK-Y][0-9][0-9]?)|(([A-PR-UWYZ][0-9][A-HJKSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY])))) [0-9][ABD-HJLNP-UW-Z]{2}))/i";

// Only read $matches when the pattern actually matched.
if (preg_match($pattern, $value, $matches)) {
    $value = $matches[0]; // Will give you SW1A 0AA
}

http://www.sitepoint.com/community/t/extract-uk-postcode-from-string/31962
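
Since the question also asks for the postcode to be removed from the address, here is a minimal sketch of one way to do both. It reuses the pattern from this answer; the function name `splitPostcode` and the way leftover commas and spaces are tidied up are assumptions of mine, not part of the original answer.

```php
<?php
// Hypothetical helper: returns [address without the postcode, postcode or null].
function splitPostcode(string $address): array
{
    // Same pattern as in the answer above.
    $pattern = "/((GIR 0AA)|((([A-PR-UWYZ][0-9][0-9]?)|(([A-PR-UWYZ][A-HK-Y][0-9][0-9]?)|(([A-PR-UWYZ][0-9][A-HJKSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY])))) [0-9][ABD-HJLNP-UW-Z]{2}))/i";

    if (!preg_match($pattern, $address, $matches)) {
        return [$address, null]; // no postcode found; leave the address alone
    }

    $postcode = $matches[0];

    // Remove the first occurrence of the postcode from the address.
    $rest = preg_replace('/' . preg_quote($postcode, '/') . '/', '', $address, 1);
    // Collapse a doubled separator left behind by a mid-string postcode.
    $rest = preg_replace('/\s*,(\s*,)+/', ',', $rest);
    // Strip separators left at either end.
    $rest = trim($rest, ' ,');

    return [$rest, $postcode];
}

[$address, $postcode] = splitPostcode('123 My Street, My Area, My City, AA11 1AA');
echo $address, "\n";  // 123 My Street, My Area, My City
echo $postcode, "\n"; // AA11 1AA
```

Returning `null` for the postcode when nothing matches lets the caller tell "no postcode" apart from an empty string.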

featherbelly

You could try splitting the string by ", ". Then the postal code will be the last item of the resulting array (I don't know much about PHP, but that's my first thought on how you could do it).
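
A minimal PHP sketch of this idea, assuming the postcode is always the last comma-separated part of the address (an assumption that will not hold for every input):

```php
<?php
$address = '123 My Street, My Area, My City, AA11 1AA';

// Split on commas and strip the whitespace around each part.
$parts = array_map('trim', explode(',', $address));

// Assumption: the last part is the postcode.
$postcode = array_pop($parts);

// Rebuild the address from the remaining parts.
$address = implode(', ', $parts);

echo $postcode, "\n"; // AA11 1AA
echo $address, "\n";  // 123 My Street, My Area, My City
```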

Anna
  • That's a good option, but the issue would be if the address had no ',' before the postcode. – Hemm K Aug 14 '13 at 14:43
  • @HemmK, well, are you sure the postcode is always at the end? How many characters does it contain? This information is needed so we can help. – Enissay Aug 14 '13 at 14:55

If you want to go overkill on this and handle all possible postcode variants, I would suggest using the "official" UK Government Data Standard postcode regular expression, as described here: RegEx for matching UK Postcodes. So something like:

$postcodeRegex = "/(GIR 0AA)|((([A-Z-[QVX]][0-9][0-9]?)|(([A-Z-[QVX]][A-Z-[IJZ]][0-9][0-9]?)|(([A-Z-[QVX]][0-9][A-HJKSTUW])|([A-Z-[QVX]][A-Z-[IJZ]][0-9][ABEHMNPRVWXY])))) [0-9][A-Z-[CIKMOV]]{2})/i";
if (preg_match($postcodeRegex, $address, $matches))
{
    $postcode = $matches[0];
}

(This gives the general idea, but the regex might need slight adjustment, as regex flavours can differ a bit.)
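
One such adjustment: the `[A-Z-[QVX]]` character class subtraction syntax is not understood by PHP's `preg_*` functions, so the classes need to be written out in full. A minimal sketch, with the subtracted letters expanded by hand (the expansion table and variable names are my own reading of the classes above, not part of the standard):

```php
<?php
// The regex as written above, using character class subtraction.
$original = '/(GIR 0AA)|((([A-Z-[QVX]][0-9][0-9]?)|(([A-Z-[QVX]][A-Z-[IJZ]][0-9][0-9]?)|(([A-Z-[QVX]][0-9][A-HJKSTUW])|([A-Z-[QVX]][A-Z-[IJZ]][0-9][ABEHMNPRVWXY])))) [0-9][A-Z-[CIKMOV]]{2})/i';

// Hand-expanded equivalents of each subtracted class.
$expanded = [
    '[A-Z-[QVX]]'    => '[A-PR-UWYZ]',      // A-Z without Q, V, X
    '[A-Z-[IJZ]]'    => '[A-HK-Y]',         // A-Z without I, J, Z
    '[A-Z-[CIKMOV]]' => '[ABD-HJLNP-UW-Z]', // A-Z without C, I, K, M, O, V
];

$postcodeRegex = strtr($original, $expanded);

$address = '123 My Street, My Area, My City, AA11 1AA';
if (preg_match($postcodeRegex, $address, $matches)) {
    $postcode = $matches[0];
    echo $postcode, "\n"; // AA11 1AA
}
```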

Steve Chambers
  • I get preg_match(): Unknown modifier '|' running this. Any chance of clarifying for 2020? – AdamJones Jan 03 '20 at 12:09
  • Looks like the `/` `/` around the regex were missing - have updated the answer but am unable to test it right now so please let me know if it works. – Steve Chambers Jan 03 '20 at 13:26
  • It doesn't error any more but I get no matches. For what it's worth, running the expression through a useful tool I have (RegExBuddy) it tells me that under PHP preg evaluation 'PHP preg does not support character class subtraction'. Is this regex definitely compatible ? – AdamJones Jan 03 '20 at 18:19
  • To be honest the answer was only really intended as a starting point and as you're discovering there's some further work needed. Also worth noting that further useful info has been posted on the linked question since my answer was posted. In particular, I'd recommend reading [this answer](https://stackoverflow.com/questions/164979/regex-for-matching-uk-postcodes/51885364#51885364) and choosing one of the suggested revised regular expressions (ideally one of the simplest ones but this will depend on what the requirements are). – Steve Chambers Jan 03 '20 at 21:40
  • ...and if you need help getting this working in PHP, let me know which version you ended up picking and I could try to assist. – Steve Chambers Jan 03 '20 at 21:40

I hope this regex helps you:

$address = '123 My Street, My Area, My City. AA11 1AA';

preg_match('/(.*)\.([^.]*)$/', $address, $matches);
$postcode = trim($matches[2]); // trim() drops the leading space left after the period

### Output    
var_dump($matches);
array (size=3)
  0 => string '123 My Street, My Area, My City. AA11 1AA' (length=41)
  1 => string '123 My Street, My Area, My City' (length=31)
  2 => string ' AA11 1AA' (length=9)
Bora

If the address is always in the order you've shown, you can use the following. The first group `(.*)` is followed by a positive lookahead assertion for a comma, `(?=,)`, and then a literal comma `,`. Next comes a positive lookbehind assertion for a comma, `(?<=,)`, followed by optional whitespace `\s*` (which we are not capturing in a group), followed by a second group for the rest of the characters in the string. Because the first `(.*)` is greedy, the split happens at the last comma, so the string only matches the way you've indicated (which is why each group has only one result).

<?php
$address = "123 My Street, My Area, My City, AA11 1AA";
$splitter = "!(.*)(?=,),(?<=,)\s*(.*)!";
preg_match_all($splitter,$address,$matches);

print_r($matches);

$short_addr = $matches[1][0];
$postal_code = $matches[2][0];

?>

Output

Array
(
    [0] => Array
        (
            [0] => 123 My Street, My Area, My City, AA11 1AA
        )

    [1] => Array
        (
            [0] => 123 My Street, My Area, My City
        )

    [2] => Array
        (
            [0] => AA11 1AA
        )

)   
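
For comparison, a simplified sketch of the same split. Since `(?=,)` is immediately followed by a literal comma and `(?<=,)` is immediately preceded by one, both lookarounds can be dropped without changing what matches. This version also uses `preg_match` rather than `preg_match_all`, so `$matches` is a flat array. It still assumes the postcode is whatever follows the last comma.

```php
<?php
$address = "123 My Street, My Area, My City, AA11 1AA";

// Greedy (.*) runs up to the last comma; \s* skips the space after it.
if (preg_match('!(.*),\s*(.*)!', $address, $matches)) {
    $short_addr  = $matches[1];
    $postal_code = $matches[2];

    echo $short_addr, "\n";  // 123 My Street, My Area, My City
    echo $postal_code, "\n"; // AA11 1AA
}
```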
AbsoluteƵERØ