239

I'm after a regex that will match only when the input string is a complete UK postcode. It must cover all of the uncommon postcode forms as well as the usual ones. For instance:

Matches

  • CW3 9SS
  • SE5 0EG
  • SE50EG
  • se5 0eg
  • WC2H 7LT

No Match

  • aWC2H 7LT
  • WC2H 7LTa
  • WC2H

How do I solve this problem?

Emma
Kieran Benton
  • 2
    @axrwkr that doesn't look helpful – Kieran Benton Jun 25 '13 at 11:13
  • 8
    [UK Postcode Validation - JavaScript and PHP](http://www.braemoor.co.uk/software/postcodes.shtml) I couldn't get the accepted answer to match valid postcodes but I found this and it does match valid postcodes. For client side validation, the JavaScript version can be used as is, for server side validation, rewriting the JavaScript as C# is fairly straightforward. It even reformats the postcode to have a space, so if you enter a postcode as W1A1AA, in addition to validating, it will reformat it to W1A 1AA. It even deals with unusual postcodes in various British territories. –  Jun 25 '13 at 11:38
  • 2
    Provided link does not work for "AA1A 1AA" formats. Reference: http://www.dhl.com.tw/content/dam/downloads/tw/express/forms/postcode_formats.pdf – Anthony Scaife Jul 18 '14 at 10:24
  • 2
    If you simply want to validate a postcode, we offer a free (sign up required) validation REST API endpoint - http://developers.alliescomputing.com/postcoder-web-api/address-lookup/validate-postcode – Stephen Keable Jan 14 '15 at 09:56
  • 1
    Good question. I think it would be worth including a central Manchester postcodes such as "M1 3HZ" in your list of uncommon examples that need to match. Many people aren't aware of the 1 letter 1 number combos. – Martin Joiner Dec 10 '17 at 15:05
  • 1
    **Many answers here are based off a broken regex provided by the UK government**. For a breakdown of these issues, please refer to [my answer here](https://stackoverflow.com/a/51885364/3600709) – ctwheels Aug 16 '18 at 21:22

33 Answers

254

I'd recommend taking a look at the UK Government Data Standard for postcodes [link now dead; archive of XML, see Wikipedia for discussion]. There is a brief description about the data and the attached xml schema provides a regular expression. It may not be exactly what you want but would be a good starting point. The RegEx differs from the XML slightly, as a P character in third position in format A9A 9AA is allowed by the definition given.

The RegEx supplied by the UK Government was:

([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?))))\s?[0-9][A-Za-z]{2})

As pointed out on the Wikipedia discussion, this will allow some non-real postcodes (e.g. those starting AA, ZY) and they do provide a more rigorous test that you could try.
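As a quick way to try the expression above, here is a minimal Python sketch that runs it against the examples from the question. The names `GOV_PATTERN` and `is_postcode_format` are illustrative only, and `re.fullmatch` is an assumption on my part: the expression as quoted carries no anchors of its own, so something has to require that the whole string matches.

```python
import re

# The expression quoted above, unchanged. It has no ^/$ anchors,
# so re.fullmatch is used to require that the whole string matches.
GOV_PATTERN = re.compile(
    r"([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})"
    r"|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?))))\s?[0-9][A-Za-z]{2})"
)


def is_postcode_format(text: str) -> bool:
    """Return True if the whole string has a postcode-like format."""
    return GOV_PATTERN.fullmatch(text) is not None


# Examples taken from the question, plus one made-up area ("AA").
for candidate in ("CW3 9SS", "SE50EG", "se5 0eg", "WC2H 7LT",
                  "aWC2H 7LT", "WC2H 7LTa", "WC2H", "AA1 1AA"):
    print(candidate, is_postcode_format(candidate))
```

Note that a string such as AA1 1AA passes, which illustrates the point about non-real postcodes: the expression checks the shape of the postcode, not whether the area exists.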

ctwheels
marcj
  • 1
    Looks like the new .gov has managed to 404 this page. Anyone got the original regex? – Tom Jul 01 '10 at 12:41
  • 55
    And that reg ex with an optional white space between the two segments (GIR 0AA)|((([A-Z-[QVX]][0-9][0-9]?)|(([A-Z-[QVX]][A-Z-[IJZ]][0-9][0-9]?)|(([A-Z-[QVX]][0-9][A-HJKSTUW])|([A-Z-[QVX]][A-Z-[IJZ]][0-9][ABEHMNPRVWXY]))))\s?[0-9][A-Z-[CIKMOV]]{2}) – gbro3n Jun 06 '12 at 18:06
  • 7
    Might be a good idea to bring the actual regex to the answer, since pages seem to expire every year... – pauloya Nov 21 '12 at 17:03
  • 1
    Doesn't look like even [Royal Mail](http://www.royalmail.com/postcode-finder) (switch to the "Address Finder" tab) can find that postcode, so I can understand why the regex fails - is this a very new postcode? In all these ones ending in a "K" is not allowed. – Zhaph - Ben Duguid Jan 12 '13 at 14:44
  • Doesn't work in a RegularExpressionValidator control in ASP.Net – user692942 Mar 07 '13 at 13:14
  • 1
    Am I wrong in thinking that this is not a standard regex? I don't recognise the `[A-Z-[QVX]]` syntax. – RichardTowers Jul 06 '13 at 21:30
  • It doesn't match EX31 3JB – gyozo kudor Jul 25 '13 at 07:43
  • 7
    Note this regex is for XML Schema, which is, obviously, slightly different from other regex flavours – artbristol Aug 06 '13 at 13:14
  • @artbristol Thanks for pointing that out. The [character class syntax that it uses](http://www.regular-expressions.info/xmlcharclass.html) means that this will not work in most other flavours. – RichardTowers Aug 20 '13 at 09:51
  • 1
    I keep finding postcodes that are valid but are not matched by this pattern. For example [N1P 1AA](http://www.doogal.co.uk/ShowMap.php?postcode=N1P%201AA) seems to be valid but is not matched. Maybe we should point this out in the answer as this kind of thing is serious (it might block users from registering). Usually you will prefer a more relaxed pattern than losing customers. – Cristian Vrabie Sep 07 '13 at 15:53
  • @CristianVrabie After a bit of back and forth (and reaching out to the Post Office for clarification) [Dan Solo seems to have the most accurate and up-to-date regex](http://stackoverflow.com/a/14257846/33051). – Zhaph - Ben Duguid Feb 17 '14 at 18:23
  • 6
    I can't get this to work in JavaScript. Does it only work with certain regex engines? – NickG Mar 27 '14 at 09:57
  • Here is a Javascript version (not equivalent to the above, but works): [A-Za-z]{1,2}[0-9][A-Za-z0-9]?\s?[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2} – gbro3n Jun 29 '15 at 13:10
  • do any of these solutions allow white space around the postcode? – SuperUberDuper Nov 06 '15 at 11:56
  • 1
    I'm just learning about more complex regex's. Just curious, what is the need for the (GIR 0AA) at the beginning of this? Obviously a newbie to this! – Mike Apr 21 '16 at 14:25
  • 20
    Actually they changed it: [Bulk Data Transfer](https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/488478/Bulk_Data_Transfer_-_additional_validation_valid_from_12_November_2015.pdf): `^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]))))[0-9][A-Za-z]{2})$` – wieczorek1990 Jun 24 '16 at 14:22
  • 5
    This is taken from https://www.gov.uk/government/publications/bulk-data-transfer-for-sponsors-xml-schema corresponding to BS7666 and **it works with JavaScript**: `^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$` – Gerard Brull Dec 12 '17 at 17:15
  • 2
    I think the UK government Regex is incorrect. The section "[A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]" permits the outcode AAA, as far as I can see, 3 letters with no number are not a valid outcode (with the exception of GIR, which is already handled at the beginning of the regex) – zeocrash Jan 16 '18 at 14:46
  • @zeocrash I discovered this issue as well. I applied a fix to this answer considering it's been upvoted over 150 times and provided a corrected version in [my own answer](https://stackoverflow.com/a/51885364/3600709). – ctwheels Aug 16 '18 at 21:19
  • 1
    So many of these regexes are pointlessly complex. There's no reason you should have to have a regex that checks for upper & lower case. Just uppercase the string on input and suddenly `[A-Ha-hJ-Yj-y]` becomes `[A-HJ-Y]`. – Daniel Quinn Oct 28 '19 at 12:06
  • This is helpful, however eg @GerardBrull - I see many postcodes fail to be picked up by this, eg GU25 4SZ, G12 8EU, EH3 8DT, N7 7EL, OL5 0HQ - any advice? – user6122500 Dec 17 '20 at 13:03
  • Using your regex the only postcodes that did not pass my tests are the ones starting with NPT, but i think they are not used anymore – Dimitris Thomas Feb 10 '21 at 22:30
  • @DanielQuinn you don't even need to uppercase the string - just use the `i` flag on the regex to make it case-insensitive – ut9081 Feb 20 '23 at 14:44
221

I recently posted an answer to this question on UK postcodes for the R language. I discovered that the UK Government's regex pattern is incorrect and fails to properly validate some postcodes. Unfortunately, many of the answers here are based on this incorrect pattern.

I'll outline some of these issues below and provide a revised regular expression that actually works.


Note

My answer (and regular expressions in general):

  • Only validates postcode formats.
  • Does not ensure that a postcode legitimately exists.
    • For this, use an appropriate API! See Ben's answer for more info.

If you don't care about the bad regex and just want to skip to the answer, scroll down to the Answer section.

The Bad Regex

The regular expressions in this section should not be used.

This is the failing regex that the UK government has provided developers (not sure how long this link will be up, but you can see it in their Bulk Data Transfer documentation):

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]))))[0-9][A-Za-z]{2})$

Problems

Problem 1 - Copy/Paste

See regex in use here.

Many developers copy/paste code (especially regular expressions) and expect it to work. While this is great in theory, it fails in this particular case because copy/pasting from this document actually changes one of the characters (a space) into a newline character, as shown below:

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]))))
[0-9][A-Za-z]{2})$

The first thing most developers will do is just erase the newline without thinking twice. Now the regex won't match postcodes with spaces in them (other than the GIR 0AA postcode).

To fix this issue, the newline character should be replaced with the space character:

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
                                                                                                                                                     ^

Problem 2 - Boundaries

See regex in use here.

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
^^                     ^ ^                                                                                                                                            ^^

The regex is improperly anchored. Anyone using it to validate postcodes might be surprised if a value like fooA11 1AA gets through. That's because they've anchored the start of the first option and the end of the second option (independently of one another), as pointed out in the regex above.

What this means is that ^ (asserts position at start of the line) only works on the first option ([Gg][Ii][Rr] 0[Aa]{2}), so the second option will validate any strings that end in a postcode (regardless of what comes before).

Similarly, the first option isn't anchored to the end of the line $, so GIR 0AAfoo is also accepted.

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]))))[0-9][A-Za-z]{2})$

To fix this issue, both options should be wrapped in another group (or non-capturing group) and the anchors placed around that:

^(([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2}))$
^^                                                                                                                                                                      ^^
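The anchoring problem is easy to reproduce. The following Python sketch (variable names are illustrative, not part of the published document) builds both the published form and the grouped form from the same two options and searches the same inputs with each:

```python
import re

GIR = r"[Gg][Ii][Rr] 0[Aa]{2}"
# Second option exactly as published (space restored), including its other flaws.
REST = (
    r"((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})"
    r"|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]))))"
    r" [0-9][A-Za-z]{2})"
)

# Here ^ binds only to the first option and $ only to the second.
published = re.compile("^(" + GIR + ")|" + REST + "$")
# One enclosing group makes both anchors apply to both options.
anchored = re.compile("^((" + GIR + ")|" + REST + ")$")

for text in ("fooA11 1AA", "GIR 0AAfoo", "A11 1AA"):
    print(text, bool(published.search(text)), bool(anchored.search(text)))
```

The published form finds a match in all three inputs, while the grouped form accepts only A11 1AA.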

Problem 3 - Improper Character Set

See regex in use here.

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
                                                                                       ^^

The regex is missing a - here to indicate a range of characters. As it stands, if a postcode is in the format ANA NAA (where A represents a letter and N represents a number), and it begins with an uppercase letter other than A or Z, it will fail.

That means it will match A1A 1AA and Z1A 1AA, but not B1A 1AA.

To fix this issue, the character - should be placed between the A and Z in the respective character set:

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
                                                                                        ^
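To see the effect of the missing hyphen in isolation, this small Python sketch compares just the affected outcode fragment with and without it (the names `broken` and `fixed` are mine):

```python
import re

# [AZa-z] is the literal letters A and Z plus the range a-z.
broken = re.compile(r"[AZa-z][0-9][A-Za-z]")
# [A-Za-z] is the two ranges A-Z and a-z.
fixed = re.compile(r"[A-Za-z][0-9][A-Za-z]")

for outcode in ("A1A", "Z1A", "B1A", "b1a"):
    print(outcode, bool(broken.fullmatch(outcode)), bool(fixed.fullmatch(outcode)))
```

Only the uppercase B1A is rejected by the broken class; lowercase input still passes because the a-z range is intact.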

Problem 4 - Wrong Optional Character Set

See regex in use here.

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
                                                                                                                                        ^

I swear they didn't even test this thing before publicizing it on the web. They made the wrong character set optional: [0-9] is optional in the fourth sub-option of option 2 (group 9). This allows the regex to match incorrectly formatted postcodes like AAA 1AA.

To fix this issue, make the next character class optional instead (and subsequently make the set [0-9] match exactly once):

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?)))) [0-9][A-Za-z]{2})$
                                                                                                                                                ^
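The same kind of isolated check works for the misplaced `?`. This Python sketch tests only the fourth sub-option, before and after the fix (the names are illustrative):

```python
import re

# As published: the digit is optional and the trailing letter is required.
broken = re.compile(r"[A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]")
# Corrected: the digit is required and the trailing letter is optional.
fixed = re.compile(r"[A-Za-z][A-Ha-hJ-Yj-y][0-9][A-Za-z]?")

for outcode in ("AAA", "WC2H", "SW1"):
    print(outcode, bool(broken.fullmatch(outcode)), bool(fixed.fullmatch(outcode)))
```

The published fragment accepts the all-letter outcode AAA, whereas the corrected one rejects it and still accepts WC2H.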

Problem 5 - Performance

Performance on this regex is extremely poor. First off, they placed the least likely option to match, GIR 0AA, at the beginning. How many users are likely to have this postcode versus any other postcode? Probably none. This means every time the regex is used, it must exhaust this option first before proceeding to the next option. To see how performance is impacted, compare the number of steps the original regex took (35) against the same regex after having flipped the options (22).

The second issue with performance is due to the way the entire regex is structured. There's no point backtracking over each option if one fails. The current regex's structure can be greatly simplified. I provide a fix for this in the Answer section.

Problem 6 - Spaces

See regex in use here

This may not be considered a problem, per se, but it does raise concern for most developers. The spaces in the regex are not optional, which means the users inputting their postcodes must place a space in the postcode. This is an easy fix by simply adding ? after the spaces to render them optional. See the Answer section for a fix.


Answer

1. Fixing the UK Government's Regex

Fixing all the issues outlined in the Problems section and simplifying the pattern yields the following, shorter, more concise pattern. We can also remove most of the groups since we're validating the postcode as a whole (not individual parts):

See regex in use here

^([A-Za-z][A-Ha-hJ-Yj-y]?[0-9][A-Za-z0-9]? ?[0-9][A-Za-z]{2}|[Gg][Ii][Rr] ?0[Aa]{2})$

This can further be shortened by removing all of the ranges from one of the cases (upper or lower case) and using a case-insensitive flag. Note: Some languages don't have one, so use the longer one above. Each language implements the case-insensitivity flag differently.

See regex in use here.

^([A-Z][A-HJ-Y]?[0-9][A-Z0-9]? ?[0-9][A-Z]{2}|GIR ?0A{2})$

Shorter again replacing [0-9] with \d (if your regex engine supports it):

See regex in use here.

^([A-Z][A-HJ-Y]?\d[A-Z\d]? ?\d[A-Z]{2}|GIR ?0A{2})$
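For illustration, here is how the case-insensitive pattern might be wired up in Python. This is a sketch under my own assumptions: the names `POSTCODE_RE` and `is_valid_format` are made up, `re.ASCII` is added so that `\d` and case folding stay within ASCII, and `fullmatch` is used because Python's `$` would otherwise also match just before a trailing newline.

```python
import re

POSTCODE_RE = re.compile(
    r"^([A-Z][A-HJ-Y]?\d[A-Z\d]? ?\d[A-Z]{2}|GIR ?0A{2})$",
    re.IGNORECASE | re.ASCII,
)


def is_valid_format(text: str) -> bool:
    """Check the format only; this says nothing about whether the postcode exists."""
    return POSTCODE_RE.fullmatch(text) is not None


# The examples from the question.
for candidate in ("CW3 9SS", "SE5 0EG", "SE50EG", "se5 0eg", "WC2H 7LT",
                  "aWC2H 7LT", "WC2H 7LTa", "WC2H"):
    print(candidate, is_valid_format(candidate))
```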

2. Simplified Patterns

Without ensuring specific alphabetic characters, the following can be used (keep in mind the simplifications from 1. Fixing the UK Government's Regex have also been applied here):

See regex in use here.

^([A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}|GIR ?0A{2})$

And even further if you don't care about the special case GIR 0AA:

^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$
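If you control the input, one option is to normalise it before matching, so the pattern never has to deal with case or spacing. The Python sketch below is only one way to do that: `format_postcode` is a hypothetical helper that strips all whitespace, uppercases, applies the simplified pattern (without the optional space), and re-inserts a single space before the final three characters.

```python
import re
from typing import Optional

# Simplified pattern from above with the optional space removed,
# because whitespace is stripped before matching.
COMPACT_RE = re.compile(r"[A-Z]{1,2}\d[A-Z\d]?\d[A-Z]{2}", re.ASCII)


def format_postcode(raw: str) -> Optional[str]:
    """Return the postcode as 'OUTCODE INCODE', or None if the format is wrong."""
    compact = "".join(raw.split()).upper()
    if COMPACT_RE.fullmatch(compact) is None:
        return None
    # The inward code is always the last three characters.
    return compact[:-3] + " " + compact[-3:]


print(format_postcode("w1a1aa"))
print(format_postcode("E1   5JX"))
print(format_postcode("WC2H"))
```

Because every whitespace character is removed first, oddly spaced input is accepted as well; keep the optional-space pattern instead if that is too lenient for your use.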

3. Complicated Patterns

I would not suggest over-verification of a postcode, as new Areas, Districts and Sub-districts may appear at any point in time. What I would suggest is adding support for edge cases. Some special cases exist and are outlined in this Wikipedia article.

Here are complex regexes that include the subsections of 3. (3.1, 3.2, 3.3).

In relation to the patterns in 1. Fixing the UK Government's Regex:

See regex in use here

^(([A-Z][A-HJ-Y]?\d[A-Z\d]?|ASCN|STHL|TDCU|BBND|[BFS]IQQ|PCRN|TKCA) ?\d[A-Z]{2}|BFPO ?\d{1,4}|(KY\d|MSR|VG|AI)[ -]?\d{4}|[A-Z]{2} ?\d{2}|GE ?CX|GIR ?0A{2}|SAN ?TA1)$

And in relation to 2. Simplified Patterns:

See regex in use here

^(([A-Z]{1,2}\d[A-Z\d]?|ASCN|STHL|TDCU|BBND|[BFS]IQQ|PCRN|TKCA) ?\d[A-Z]{2}|BFPO ?\d{1,4}|(KY\d|MSR|VG|AI)[ -]?\d{4}|[A-Z]{2} ?\d{2}|GE ?CX|GIR ?0A{2}|SAN ?TA1)$
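A short Python sketch for spot-checking the combined pattern against a handful of the special formats covered in the subsections of 3. The pattern string is the one directly above; the name `EXTENDED_RE` and the sample values are mine, chosen only to exercise each alternative.

```python
import re

EXTENDED_RE = re.compile(
    r"^(([A-Z]{1,2}\d[A-Z\d]?|ASCN|STHL|TDCU|BBND|[BFS]IQQ|PCRN|TKCA) ?\d[A-Z]{2}"
    r"|BFPO ?\d{1,4}|(KY\d|MSR|VG|AI)[ -]?\d{4}|[A-Z]{2} ?\d{2}"
    r"|GE ?CX|GIR ?0A{2}|SAN ?TA1)$",
    re.ASCII,
)

# Sample values in each special format, followed by two that should fail.
samples = ("ASCN 1ZZ", "GX11 1AA", "BFPO 1234", "KY1-1111", "AI-2640",
           "VG1111", "MSR 1111", "GE CX", "SAN TA1", "BFPO 12345", "SANTA")
for sample in samples:
    print(sample, EXTENDED_RE.fullmatch(sample) is not None)
```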

3.1 British Overseas Territories

The Wikipedia article currently states (some formats slightly simplified):

  • AI-1111: Anguilla
  • ASCN 1ZZ: Ascension Island
  • STHL 1ZZ: Saint Helena
  • TDCU 1ZZ: Tristan da Cunha
  • BBND 1ZZ: British Indian Ocean Territory
  • BIQQ 1ZZ: British Antarctic Territory
  • FIQQ 1ZZ: Falkland Islands
  • GX11 1ZZ: Gibraltar
  • PCRN 1ZZ: Pitcairn Islands
  • SIQQ 1ZZ: South Georgia and the South Sandwich Islands
  • TKCA 1ZZ: Turks and Caicos Islands
  • BFPO 11: Akrotiri and Dhekelia
  • ZZ 11 & GE CX: Bermuda (according to this document)
  • KY1-1111: Cayman Islands (according to this document)
  • VG1111: British Virgin Islands (according to this document)
  • MSR 1111: Montserrat (according to this document)

An all-encompassing regex to match only the British Overseas Territories might look like this:

See regex in use here.

^((ASCN|STHL|TDCU|BBND|[BFS]IQQ|GX\d{2}|PCRN|TKCA) ?\d[A-Z]{2}|(KY\d|MSR|VG|AI)[ -]?\d{4}|(BFPO|[A-Z]{2}) ?\d{2}|GE ?CX)$

3.2 British Forces Post Office

Although they've recently been changed to BF# (where # represents a number) to better align with the British postcode system, they're considered optional alternative postcodes. These postcodes follow(ed) the format of BFPO, followed by 1-4 digits:

See regex in use here

^BFPO ?\d{1,4}$

3.3 Santa?

There's another special case with Santa (as mentioned in other answers): SAN TA1 is a valid postcode. A regex for this is very simple:

^SAN ?TA1$
ctwheels
  • 7
    The simplified patterns are a really good option to use. I find it's best not to be too restrictive with a regex as you then need to ensure it is updated with any changes or you could have very angry users. I feel its better to loosely match with a simplified regex to weed out the obvious errors and then apply further checks such as an address lookup (or confirmation email in the case of email regex) to confirm the validity. – James Coyle Mar 22 '19 at 09:20
  • 2
    Excellent and thorough analysis. – Steve May 24 '19 at 15:17
  • 3
    Brilliant answer on so many levels. Ultimately, I went with your 2nd simplified pattern. As I actually have a DB with all the UK postcodes in, I just need a first pass to see if an address string potentially contains a valid postcode, so I don't care about false positives (as the actual lookup will root them out), but I do care about false negatives. And speed also matters. – John Powell Jun 09 '20 at 09:03
  • There are so many issues with the UK postcode system, manifestly created by committee before the computer era, but the issue of variable length and spaces is one of the most pernicious. I have seen all manner of horrors, including padding postcodes like E1 5JX to E1   5JX, i.e. with three spaces, so that it aligns nicely with SW18 5HA in Excel (insert hideously inappropriate software of choice for managing addresses). The only sane solution, IMHO, is to strip out all the spaces, so that the postcode is a single string before it gets anywhere near Elastic, Solr, Postgres, etc. – John Powell Jun 09 '20 at 09:19
  • @JohnPowell sanitization is always a good idea, but if that's not possible for whatever reason, you could simply replace my optional spaces ` ?` (0 or 1 space) with ` *` to indicate 0 or more spaces. – ctwheels Sep 28 '20 at 16:12
  • I ended up removing the spaces using regexp and your regex and also stored the postcodes without spaces in Elastic, as spaces add no information. Works really well – John Powell Sep 28 '20 at 18:58
  • Excellent answer, using your expressions the only postcodes that didn't pass my tests are BF1 and NPT, but i think they are not used anymore. Using the expression in top answer (which is the official regex) only the ones starting with NPT do not pass. However yours is optimized for all the reasons you mentioned so i ll use yours. Thank you – Dimitris Thomas Feb 10 '21 at 22:28
  • @DimitrisThomas if you send me the `BF1` postcode that didn't pass (or at least the character representation similar to `BF1 0AA` (`A` = letter, `0` = digit), I can investigate. My answer should already work on that format. As for `NPT`, I excluded these as they've *supposedly* switched all `NPT` postcodes to `NP1-9` (there's mention of it [here](https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Modern_postcode_system) - stating that `NPT` lasted until 1984; I know it's not the most trustworthy source, but really the best resource I had to go off of for most of my answer) – ctwheels Feb 10 '21 at 22:59
  • @ctwheels Hey, i found the list here https://www.doogal.co.uk/ukpostcodes.php , download link is at the upper right Download -> Full List of Postcodes as csv . Looks like there are 48 of them and they are still in use. As i said not a big deal anyway – Dimitris Thomas Feb 11 '21 at 09:59
  • 5
    @Sunhat I don’t appreciate it being called a mess, I clearly detail every part of the post. My answer provides multiple answers because one solution doesn’t fit all problems. Take for instance that regex engines are all implemented differently, so while `\d` may work on most, it does not work on all. Add the fact that the UK government specifies character ranges rather than the entire alphabet and that different postcode formats exist for military, islands, etc. Automatically, with just those 3 criteria, you get 6 versions. I think I’ve done well at answering the question and 120+ others agree – ctwheels May 06 '21 at 13:12
  • No problem @ctwheels I'll use this ^SAN ?TA1$ considering that's the final answer – Sunhat May 06 '21 at 13:15
  • I'm sorry, but how is "AA111AA" a "valid" postcode?? – Mecanik Nov 01 '21 at 09:02
  • @Mecanik see [this](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/611951/Appendix_C_ILR_2017_to_2018_v1_Published_28April17.pdf) document from UK Government that establishes format `AANN NAA` as a valid postcode format. Omit the space for ease of entry by user. – ctwheels Nov 01 '21 at 11:28
  • Note! The simplified regex does not match all partial postcodes, i.e. it doesn't match B16 or B1, but does match WV14, for example. This will fix that issue: /(([A-Z]{1,2}\d[A-Z\d]?|ASCN|STHL|TDCU|BBND|[BFS]IQQ|PCRN|TKCA) ?\d[A-Z]{2}|BFPO ?\d{1,4}|(KY\d|MSR|VG|AI)[ -]?\d{4}|[A-Z]{1,2} ?\d{1,2}|GE ?CX|GIR ?0A{2}|SAN ?TA1)/i – Steve Childs Mar 26 '22 at 10:59
  • 1
    This was so helpful to me; I just had to extract things that looked like postcodes from email data, so I used... [A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2} . However, the explanations were clear and precise, a shining example of what an SO post should be, 5 cups! – Steve Hibbert Apr 28 '23 at 11:05
86

It looks like we're going to be using ^(GIR ?0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]([0-9ABEHMNPRV-Y])?)|[0-9][A-HJKPS-UW]) ?[0-9][ABD-HJLNP-UW-Z]{2})$, which is a slightly modified version of that suggested by Minglis above.

However, we're going to have to investigate exactly what the rules are, as the various solutions listed above appear to apply different rules as to which letters are allowed.

After some research, we've found some more information. Apparently a page on 'govtalk.gov.uk' points you to a postcode specification govtalk-postcodes. This points to an XML schema at XML Schema which provides a 'pseudo regex' statement of the postcode rules.

We've taken that and worked on it a little to give us the following expression:

^((GIR ?0AA)|((([A-PR-UWYZ][A-HK-Y]?[0-9][0-9]?)|(([A-PR-UWYZ][0-9][A-HJKSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRV-Y]))) ?[0-9][ABD-HJLNP-UW-Z]{2}))$

This makes spaces optional, but does limit you to one space (replace the '?' with '{0,}' for unlimited spaces). It assumes all text must be upper-case.

If you want to allow lower case, with any number of spaces, use:

^(([gG][iI][rR] {0,}0[aA]{2})|((([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y]?[0-9][0-9]?)|(([a-pr-uwyzA-PR-UWYZ][0-9][a-hjkstuwA-HJKSTUW])|([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y][0-9][abehmnprv-yABEHMNPRV-Y]))) {0,}[0-9][abd-hjlnp-uw-zABD-HJLNP-UW-Z]{2}))$

This doesn't cover overseas territories and only enforces the format, NOT the existence of different areas. It is based on the following rules:

Can accept the following formats:

  • “GIR 0AA”
  • A9 9ZZ
  • A99 9ZZ
  • AB9 9ZZ
  • AB99 9ZZ
  • A9C 9ZZ
  • AD9E 9ZZ

Where:

  • 9 can be any single digit number.
  • A can be any letter except for Q, V or X.
  • B can be any letter except for I, J or Z.
  • C can be any letter except for I, L, M, N, O, P, Q, R, V, X, Y or Z.
  • D can be any letter except for I, J or Z.
  • E can be any of A, B, E, H, M, N, P, R, V, W, X or Y.
  • Z can be any letter except for C, I, K, M, O or V.
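The rules above translate fairly directly into character classes. The following Python sketch derives each class from its list of excluded letters and assembles the accepted formats from them; `letters_except`, `OUTCODE` and `POSTCODE` are names I have made up, and the result is meant to illustrate the rules rather than reproduce the expression above character for character.

```python
import re
import string


def letters_except(excluded: str) -> str:
    """Build a character class of uppercase letters, minus the excluded ones."""
    allowed = [c for c in string.ascii_uppercase if c not in excluded]
    return "[" + "".join(allowed) + "]"


A = letters_except("QVX")
B = D = letters_except("IJZ")
C = letters_except("ILMNOPQRVXYZ")
E = "[ABEHMNPRVWXY]"
Z = letters_except("CIKMOV")

# A9, A99, AB9, AB99 | A9C | AD9E
OUTCODE = f"(?:{A}{B}?[0-9][0-9]?|{A}[0-9]{C}|{A}{D}[0-9]{E})"
POSTCODE = re.compile(f"(?:GIR ?0AA|{OUTCODE} ?[0-9]{Z}{{2}})")

for candidate in ("WC2H 7LT", "CW3 9SS", "SE50EG", "GIR 0AA", "QA1 1AA", "A1 1AC"):
    print(candidate, POSTCODE.fullmatch(candidate) is not None)
```

Keeping the excluded letters as data makes it easier to adjust a rule later than editing ranges inside one long expression.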

Best wishes

Colin

Umber Ferrule
Colin
  • 2
    Great answer, I added in the overseas ones `^(([gG][iI][rR] {0,}0[aA]{2})|(([aA][sS][cC][nN]|[sS][tT][hH][lL]|[tT][dD][cC][uU]|[bB][bB][nN][dD]|[bB][iI][qQ][qQ]|[fF][iI][qQ][qQ]|[pP][cC][rR][nN]|[sS][iI][qQ][qQ]|[tT][kK][cC][aA]) {0,}1[zZ]{2})|((([a-pr-uwyzA-PR-UWYZ][a-hk-yxA-HK-XY]?[0-9][0-9]?)|(([a-pr-uwyzA-PR-UWYZ][0-9][a-hjkstuwA-HJKSTUW])|([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y][0-9][abehmnprv-yABEHMNPRV-Y]))) {0,}[0-9][abd-hjlnp-uw-zABD-HJLNP-UW-Z]{2}))$` – David Bradshaw Nov 22 '16 at 17:12
  • Why specify `{0,}` instead of `*` for unlimited, optional spaces? – Code Animal Aug 01 '17 at 10:52
48

There is no such thing as a comprehensive UK postcode regular expression that is capable of validating a postcode. You can check that a postcode is in the correct format using a regular expression; not that it actually exists.

Postcodes are arbitrarily complex and constantly changing. For instance, the outcode W1 does not, and may never, have every number between 1 and 99, for every postcode area.

You can't expect what is there currently to be true forever. As an example, in 1990, the Post Office decided that Aberdeen was getting a bit crowded. They added a 0 to the end of AB1-5 making it AB10-50 and then created a number of postcodes in between these.

Whenever a new street is built a new postcode is created. It's part of the process for obtaining permission to build; local authorities are obliged to keep this updated with the Post Office (not that they all do).

Furthermore, as noted by a number of other users, there are the special postcodes such as Girobank, GIR 0AA, and the one for letters to Santa, SAN TA1 - you probably don't want to post anything there but it doesn't appear to be covered by any other answer.

Then, there are the BFPO postcodes, which are now changing to a more standard format. Both formats are going to be valid. Lastly, there are the overseas territories (source: Wikipedia).

+----------+----------------------------------------------+
| Postcode |                   Location                   |
+----------+----------------------------------------------+
| AI-2640  | Anguilla                                     |
| ASCN 1ZZ | Ascension Island                             |
| STHL 1ZZ | Saint Helena                                 |
| TDCU 1ZZ | Tristan da Cunha                             |
| BBND 1ZZ | British Indian Ocean Territory               |
| BIQQ 1ZZ | British Antarctic Territory                  |
| FIQQ 1ZZ | Falkland Islands                             |
| GX11 1AA | Gibraltar                                    |
| PCRN 1ZZ | Pitcairn Islands                             |
| SIQQ 1ZZ | South Georgia and the South Sandwich Islands |
| TKCA 1ZZ | Turks and Caicos Islands                     |
+----------+----------------------------------------------+

Next, you have to take into account that the UK "exported" its postcode system to many places in the world. Anything that validates a "UK" postcode will also validate the postcodes of a number of other countries.

If you want to validate a UK postcode the safest way to do it is to use a look-up of current postcodes. There are a number of options:

  • Ordnance Survey releases Code-Point Open under an open data licence. It'll be very slightly behind the times but it's free. This will (probably - I can't remember) not include Northern Irish data as the Ordnance Survey has no remit there. Mapping in Northern Ireland is conducted by the Ordnance Survey of Northern Ireland and they have their, separate, paid-for, Pointer product. You could use this and append the few that aren't covered fairly easily.

  • Royal Mail releases the Postcode Address File (PAF), this includes BFPO which I'm not sure Code-Point Open does. It's updated regularly but costs money (and they can be downright mean about it sometimes). PAF includes the full address rather than just postcodes and comes with its own Programmers Guide. The Open Data User Group (ODUG) is currently lobbying to have PAF released for free, here's a description of their position.

  • Lastly, there's AddressBase. This is a collaboration between Ordnance Survey, Local Authorities, Royal Mail and a matching company to create a definitive directory of all information about all UK addresses (they've been fairly successful as well). It's paid-for but if you're working with a Local Authority, government department, or government service it's free for them to use. There's a lot more information than just postcodes included.
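Whichever data source you pick, the look-up itself can be very small. This is a minimal Python sketch, assuming you have a CSV export whose first column holds the postcode; the file name `postcodes.csv` in the usage comment is hypothetical, and the helper names are mine.

```python
import csv
from typing import Iterable, Set


def normalise(postcode: str) -> str:
    """Uppercase and drop all whitespace so that spacing differences do not matter."""
    return "".join(postcode.split()).upper()


def load_postcodes(lines: Iterable[str]) -> Set[str]:
    """Collect the first CSV column of every non-empty row into a set."""
    return {normalise(row[0]) for row in csv.reader(lines) if row}


def postcode_exists(postcode: str, known: Set[str]) -> bool:
    return normalise(postcode) in known


# Usage (hypothetical file name; assumes the postcode is the first column):
# with open("postcodes.csv", newline="") as handle:
#     known = load_postcodes(handle)
# postcode_exists("se5 0eg", known)
```

A format regex can still be run first as a cheap filter for obvious typos before the set is consulted.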

Tony
Ben
22
^([A-PR-UWYZ0-9][A-HK-Y0-9][AEHMNPRTVXY0-9]?[ABEHMNPRVWXY0-9]? {1,2}[0-9][ABD-HJLN-UW-Z]{2}|GIR 0AA)$

Regular expression to match valid UK postcodes. In the UK postal system not all letters are used in all positions (the same with vehicle registration plates) and there are various rules to govern this. This regex takes into account those rules. Details of the rules:

First half of postcode

Valid formats:

  • [A-Z][A-Z][0-9][A-Z]
  • [A-Z][A-Z][0-9][0-9]
  • [A-Z][0-9][0-9]
  • [A-Z][A-Z][0-9]
  • [A-Z][A-Z][A-Z]
  • [A-Z][0-9][A-Z]
  • [A-Z][0-9]

Exceptions:

  • Position: First. Constraint: QVX not used
  • Position: Second. Constraint: IJZ not used except in GIR 0AA
  • Position: Third. Constraint: AEHMNPRTVXY only used
  • Position: Fourth. Constraint: ABEHMNPRVWXY

Second half of postcode

Valid formats:

  • [0-9][A-Z][A-Z]

Exceptions:

  • Position: Second and Third. Constraint: CIKMOV not used

http://regexlib.com/REDetails.aspx?regexp_id=260

Dan
  • 1
    No idea why people have downvoted this answer - it's the correct regex – Ollie Mar 25 '10 at 17:59
  • The regex does not work for postal codes "YO31" and "YO31 1" in JavaScript. – Pratik Khadloya Dec 08 '11 at 01:12
  • 9
    I don't think this is correct, since the regex given contradicts the description, and suggests you can have postcodes starting with `0-9`, which you can't – Luigi Plinge Apr 26 '12 at 20:05
  • 4
    This regex fails on about 6000 valid postcodes, so I'd recommend against it. See [my answer](http://stackoverflow.com/a/17507615/1344760). – RichardTowers Jul 06 '13 at 22:20
  • this fails on any postcode in lowercase or without a space for me – Dancer Dec 17 '14 at 17:01
  • @Dancer To keep these regex's even remotely manageable, they tend to support either upper or lower case postcodes, but not both. The documentation uses upper-case throughout. From a validation point of view, you would write it to support one, and change the case as needed. With regards to the space issue, the docs state "The first part, or Outward Code, is separated from the second part, the Inward Code, by a single space" - the space is therefore required. – Zhaph - Ben Duguid Aug 03 '15 at 17:08
21

I had a look into some of the answers above and I'd recommend against using the pattern from @Dan's answer (c. Dec 15 '10), since it incorrectly flags almost 0.4% of valid postcodes as invalid, while the others do not.

Ordnance Survey provide a service called Code-Point Open, which:

contains a list of all the current postcode units in Great Britain

I ran each of the regexes above against the full list of postcodes (Jul 6 '13) from this data using grep:

cat CSV/*.csv |
    # Strip leading quotes
    sed -e 's/^"//g' |
    # Strip trailing quote and everything after it
    sed -e 's/".*//g' |
    # Strip any spaces
    sed -E -e 's/ +//g' |
    # Find any lines that do not match the expression
    grep --invert-match --perl-regexp "$pattern"

There are 1,686,202 postcodes total.

The following are the numbers of valid postcodes that do not match each $pattern:

'^([A-PR-UWYZ0-9][A-HK-Y0-9][AEHMNPRTVXY0-9]?[ABEHMNPRVWXY0-9]?[0-9][ABD-HJLN-UW-Z]{2}|GIR 0AA)$'
# => 6016 (0.36%)
'^(GIR ?0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]([0-9ABEHMNPRV-Y])?)|[0-9][A-HJKPS-UW]) ?[0-9][ABD-HJLNP-UW-Z]{2})$'
# => 0
'^GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|BX|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}$'
# => 0

Of course, these results only deal with valid postcodes that are incorrectly flagged as invalid. So:

'^.*$'
# => 0

I'm saying nothing about which pattern is the best regarding filtering out invalid postcodes.
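For anyone without grep to hand, the same check can be sketched in Python. This is only an illustration of the method above, not the script that produced those numbers: `count_unmatched` and the two-item sample list are made up for the example, and the real input would be the full Code-Point Open list.

```python
import re

# The first two patterns tested above, copied verbatim.
FIRST_PATTERN = r'^([A-PR-UWYZ0-9][A-HK-Y0-9][AEHMNPRTVXY0-9]?[ABEHMNPRVWXY0-9]?[0-9][ABD-HJLN-UW-Z]{2}|GIR 0AA)$'
SECOND_PATTERN = r'^(GIR ?0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]([0-9ABEHMNPRV-Y])?)|[0-9][A-HJKPS-UW]) ?[0-9][ABD-HJLNP-UW-Z]{2})$'


def count_unmatched(postcodes, pattern):
    """Count postcodes the pattern fails to match (hypothetical helper).

    Spaces are stripped first, mirroring the sed step in the pipeline above.
    """
    regex = re.compile(pattern)
    return sum(1 for pc in postcodes if regex.search(pc.replace(' ', '')) is None)


# Tiny stand-in for the real data set (assumption: two postcodes quoted in this thread).
sample = ['W1K 7AA', 'SE5 0EG']
```

On this sample the first pattern misses `W1K 7AA`, the second misses nothing, and `^.*$` trivially misses nothing either, which is the same caveat as above: a count of zero says nothing about how well a pattern rejects invalid input.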

Community
  • 1
  • 1
RichardTowers
  • 4,682
  • 1
  • 26
  • 43
  • 1
    Isn't this what I say in my answer and if you're going down the disproof route you should probably do them all, and keep it updated if someone changes their answer? If not, at least reference the date of the last edit of the answer you took it from so people can see whether it's been changed since. – Ben Jul 10 '13 at 06:04
  • Fair point. Edited accordingly. I think it adds to the discussion to point out that most of these patterns don't exclude any of the CPO codes, but that the most upvoted (valid regex) answer does. Future readers: be aware that my results are likely to be out of date. – RichardTowers Jul 10 '13 at 18:47
13

Most of the answers here didn't work for all the postcodes I have in my database. I finally found one that validates with all, using the new regex provided by the government:

https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/413338/Bulk_Data_Transfer_-_additional_validation_valid_from_March_2015.pdf

It isn't in any of the previous answers so I post it here in case they take the link down:

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$

UPDATE: Updated regex as pointed out by Jamie Bull. Not sure if it was my error copying or it was an error in the government's regex; the link is down now...

UPDATE: As ctwheels found, this regex works with the javascript regex flavor. See his comment for one that works with the pcre (php) flavor.
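The anchoring problem raised in the comments is easy to reproduce. The sketch below is an illustration in Python rather than anything from the government document; the names `CORE`, `AS_PUBLISHED` and `WRAPPED` are invented for the example. Because `|` binds more loosely than `^` and `$`, the published pattern anchors only the start of the first alternative and only the end of the second.

```python
import re

# The body of the regex above, without the leading ^ and trailing $.
CORE = (
    r'([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})'
    r'|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})'
)

# As published: ^ applies only to the GIR branch, $ only to the general branch.
AS_PUBLISHED = re.compile('^' + CORE + '$')

# Wrapped in a non-capturing group so both anchors apply to every branch.
WRAPPED = re.compile('^(?:' + CORE + ')$')

for text in ['SE5 0EG', 'WC2H 7LT', 'GIR 0AAxyz', 'xyzSE5 0EG']:
    print(text, bool(AS_PUBLISHED.search(text)), bool(WRAPPED.search(text)))
```

The published form accepts all four strings, including the two with junk attached, while the wrapped form accepts only the two real postcodes.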

Jesús Carrera
  • 11,275
  • 4
  • 63
  • 55
  • 1
    `^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$` should be `^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$` - spot the difference ;-) – Jamie Bull May 16 '14 at 13:08
  • 2
    This is the only answer here that has worked in http://www.regexr.com/ and Notepad++. Although, I had to change it to `([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) ?[0-9][A-Za-z]{2})` (removed `^` and `$` and added a `?` after the space) for http://www.regexr.com/ to find more than one result and for both to find a result that doesn't have a space separator. – mythofechelon Feb 26 '15 at 23:59
  • @ctwheels this regex is for the javascript flavor. If your in fail link you select javascript it will work. That's a great catch and I'll update my answer. – Jesús Carrera Aug 14 '18 at 11:06
  • @JesúsCarrera my apologies, I posted the wrong link. I'll repost it below with the correct link and remove the old one after. – ctwheels Aug 14 '18 at 14:19
  • 1
    The regex posted in the documentation is inherently incorrect. The entire expression should be wrapped in a non-capturing group `(?:)` and then anchors placed around it. See it fail [here](https://regex101.com/r/KleL5c/1). For more information, [see my answer here](https://stackoverflow.com/a/51828886/3600709). `^(?:([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2}))$` is the corrected regular expression. – ctwheels Aug 14 '18 at 14:34
  • @JesúsCarrera the regex I posted above is the corrected regex for many flavours (not just PCRE). It's the corrected version for PHP, JavaScript, Python, etc. – ctwheels Aug 14 '18 at 14:35
  • @ctwheels I see how the regex from the gov docs fails for your examples, I will update to use your modification which seems to work better, however, I can see your modification matching correct postcodes only in the js flavor, did you check that? – Jesús Carrera Aug 15 '18 at 09:53
  • @JesúsCarrera Don't want to take up any more of your time mate but is there an easy way to change this to just check and match the first part of a postcode eg L2, OX12, SW4? I am so hopeless at Regex. – Freddie Ergatoudis Aug 15 '18 at 11:26
  • @JesúsCarrera pretty sure it's something like this: `^([Gg][Ii][Rr] 0[Aa]{2})|([A-Za-z][0-9]{1,2})|([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|([AZa-z][0-9][A-Za-z])` but would appreciate if you or anyone else could take a look for faults :) – Freddie Ergatoudis Aug 15 '18 at 12:15
13

According to this Wikipedia table

This pattern covers all the cases:

(?:[A-Za-z]\d ?\d[A-Za-z]{2})|(?:[A-Za-z][A-Za-z\d]\d ?\d[A-Za-z]{2})|(?:[A-Za-z]{2}\d{2} ?\d[A-Za-z]{2})|(?:[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]{2})|(?:[A-Za-z]{2}\d[A-Za-z] ?\d[A-Za-z]{2})

When using it on Android/Java, write \\d in the string literal.
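The pattern has no anchors, so on its own it will also find a postcode inside a longer string. One way to use it for whole-string validation is sketched below in Python; `FORMAT_ONLY` and `looks_like_postcode` are names invented for this example, and `re.ASCII` is an added assumption so that `\d` means `[0-9]` only. A raw string avoids the doubled backslashes.

```python
import re

# The five alternatives from the pattern above, one per line.
FORMAT_ONLY = re.compile(
    r'(?:[A-Za-z]\d ?\d[A-Za-z]{2})'
    r'|(?:[A-Za-z][A-Za-z\d]\d ?\d[A-Za-z]{2})'
    r'|(?:[A-Za-z]{2}\d{2} ?\d[A-Za-z]{2})'
    r'|(?:[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]{2})'
    r'|(?:[A-Za-z]{2}\d[A-Za-z] ?\d[A-Za-z]{2})',
    re.ASCII,
)


def looks_like_postcode(text):
    """True if the whole string has the shape of a UK postcode (format check only)."""
    # fullmatch requires the entire string to match, so no ^ or $ is needed.
    return FORMAT_ONLY.fullmatch(text) is not None
```

With this, the "Matches" examples from the question (`CW3 9SS`, `SE5 0EG`, `SE50EG`, `se5 0eg`, `WC2H 7LT`) pass and the "No Match" examples (`aWC2H 7LT`, `WC2H 7LTa`, `WC2H`) fail.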

AntPachon
  • 1,152
  • 12
  • 14
  • I found this the most readable answer, although it only looks for form of a postcode, rather than actual valid codes as per the solutions which take the info from the gov.uk website, but that's good enough for my use case. After playing with it a bit (in python), I factored it out to a slightly more compact but equivalent regex which also allows for an optional space: ([a-zA-Z](?:(?:[a-zA-Z]?\d[a-zA-Z])|(?:\d{1,2})|(?:[a-zA-Z]\d{1,2}))\W?[0-9][a-zA-Z]{2}) – Richard J Jul 21 '15 at 17:58
12

This is the regex Google serves on their i18napis.appspot.com domain:

GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|BX|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}
Alix Axel
  • 151,645
  • 95
  • 393
  • 500
12

An old post but still pretty high in google results so thought I'd update. This Oct 14 doc defines the UK postcode regular expression as:

^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([**AZ**a-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$

from:

https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/359448/4__Bulk_Data_Transfer_-_additional_validation_valid.pdf

The document also explains the logic behind it. However, it has an error (bolded) and also allows lower case, which although legal is not usual, so amended version:

^(GIR 0AA)|((([A-Z][0-9]{1,2})|(([A-Z][A-HJ-Y][0-9]{1,2})|(([A-Z][0-9][A-Z])|([A-Z][A-HJ-Y][0-9]?[A-Z])))) [0-9][A-Z]{2})$

This works with new London postcodes (e.g. W1D 5LH) that previous versions did not.

Vivek Jain
  • 3,811
  • 6
  • 30
  • 47
deadcrab
  • 433
  • 3
  • 7
  • It looks like the error you highlighted in bold has been fixed in the document but I still prefer your regular expression as it is easier to read. – Professor of programming Sep 27 '15 at 00:01
  • 5
    The only thing I would say is make the space optional by changing the space to \s?, as the space isn't a requirement; it's only there for readability. – Professor of programming Sep 27 '15 at 00:07
  • The regex posted in the documentation is inherently incorrect. The entire expression should be wrapped in a non-capturing group `(?:)` and then anchors placed around it. See it fail [here](https://regex101.com/r/KleL5c/1). For more information, [see my answer here](https://stackoverflow.com/a/51828886/3600709). `^(?:([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2}))$` is the corrected regular expression. – ctwheels Aug 14 '18 at 14:34
10

I've been looking for a UK postcode regex for the last day or so and stumbled on this thread. I worked my way through most of the suggestions above and none of them worked for me so I came up with my own regex which, as far as I know, captures all valid UK postcodes as of Jan '13 (according to the latest literature from the Royal Mail).

The regex and some simple postcode checking PHP code is posted below. NOTE:- It allows for lower or uppercase postcodes and the GIR 0AA anomaly but to deal with the, more than likely, presence of a space in the middle of an entered postcode it also makes use of a simple str_replace to remove the space before testing against the regex. Any discrepancies beyond that and the Royal Mail themselves don't even mention them in their literature (see http://www.royalmail.com/sites/default/files/docs/pdf/programmers_guide_edition_7_v5.pdf and start reading from page 17)!

Note: In the Royal Mail's own literature (link above) there is a slight ambiguity surrounding the 3rd and 4th positions and the exceptions in place if these characters are letters. I contacted Royal Mail directly to clear it up and in their own words "A letter in the 4th position of the Outward Code with the format AANA NAA has no exceptions and the 3rd position exceptions apply only to the last letter of the Outward Code with the format ANA NAA." Straight from the horse's mouth!

<?php

    $postcoderegex = '/^([g][i][r][0][a][a])$|^((([a-pr-uwyz]{1}([0]|[1-9]\d?))|([a-pr-uwyz]{1}[a-hk-y]{1}([0]|[1-9]\d?))|([a-pr-uwyz]{1}[1-9][a-hjkps-uw]{1})|([a-pr-uwyz]{1}[a-hk-y]{1}[1-9][a-z]{1}))(\d[abd-hjlnp-uw-z]{2})?)$/i';

    $postcode2check = str_replace(' ','',$postcode2check);

    if (preg_match($postcoderegex, $postcode2check)) {

        echo "$postcode2check is a valid postcode<br>";

    } else {

        echo "$postcode2check is not a valid postcode<br>";

    }

?>

I hope it helps anyone else who comes across this thread looking for a solution.

Dan Solo
  • 697
  • 1
  • 8
  • 21
  • 1
    I'd be curious to know which example postcodes were failing the published one? – Zhaph - Ben Duguid Jan 12 '13 at 14:37
  • I can't give you a specific postcode (without having access to the full PAF list) but postcodes with the format ANA NAA would potentially fail as the letters P and Q are allowed in the 3rd position and postcodes with the format AANA NAA would potentially also fail as the 4th position allows all letters (the regex given in the accepted answer above does not account for either of these). As I say I'm only going by the current advice from the Royal Mail - at the time of the answer above, maybe that regex was fully compliant. – Dan Solo Jan 14 '13 at 11:09
  • Thanks for the heads up - I can see that "P" appears to have been added as acceptable in the third position (from your linked doc), but not Q - but where are you reading that "the 4th position allows all letters"? The doc doesn't mention the "forth position" at all as far as I can see, so I'd read that as "the third letter regardless of actual position". – Zhaph - Ben Duguid Jan 14 '13 at 14:00
  • Good point on the Q - error on my part! However it becomes a matter of interpretation on the 3rd/4th letters and I'm not sure which one of us is right. The doc specifically mentions the 1st and 2nd letters as being the "first/second alpha position" but the 3rd one only as being the "third position". I interpreted this as the 3rd character along (alpha or numeric) in a postcode like A1B 2DE. Otherwise surely the B in the example above could potentially be translated as the letter in the 2nd alpha position making the published regex wrong anyway? – Dan Solo Jan 14 '13 at 15:42
  • Agreed there's no mention of the 4th either way but I guess I'm being consistent with following my logic through... (Also just noticed that the approved answer above has another list of letter exceptions for the 4th character in the postcode format AANA - none of which are mentioned at all in the Royal Mail literature). Maybe I need to contact Royal Mail to clear it up once and for all. We'll be receiving their latest PAF any day soon. – Dan Solo Jan 14 '13 at 15:43
  • 1
    Just had word back from the Royal Mail support team and my interpretation of the rules is correct apparently. A letter in the 4th position of the Outward Code (e.g. AANA NAA) has no exceptions and the 3rd position exceptions apply only to the last letter (e.g. ANA NAA). Straight from the horse's mouth. – Dan Solo Jan 16 '13 at 17:47
  • Good to know - you might want to update your answer with this info ;) – Zhaph - Ben Duguid Jan 16 '13 at 21:19
  • 1
    @DanSolo This regex will return a true match for the first half of a valid postcode missing the inward code e.g ``SW1A`` or ``BD25`` without the second half (or at least it did for me) – decvalts Aug 07 '15 at 13:19
10

Postcodes are subject to change, and the only true way of validating a postcode is to have the complete list of postcodes and see if it's there.

But regular expressions are useful because they:

  • are easy to use and implement
  • are short
  • are quick to run
  • are quite easy to maintain (compared to a full list of postcodes)
  • still catch most input errors

But regular expressions tend to be difficult to maintain, especially for someone who didn't come up with them in the first place. So a regex must be:

  • as easy to understand as possible
  • relatively future proof

That means that most of the regular expressions in this answer aren't good enough. E.g. I can see that [A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRV-Y] is going to match a postcode area of the form AA1A — but it's going to be a pain in the neck if and when a new postcode area gets added, because it's difficult to understand which postcode areas it matches.

I also want my regular expression to match the first and second half of the postcode as parenthesised matches.

So I've come up with this:

(GIR(?=\s*0AA)|(?:[BEGLMNSW]|[A-Z]{2})[0-9](?:[0-9]|(?<=N1|E1|SE1|SW1|W1|NW1|EC[0-9]|WC[0-9])[A-HJ-NP-Z])?)\s*([0-9][ABD-HJLNP-UW-Z]{2})

In PCRE format it can be written as follows:

/^
  ( GIR(?=\s*0AA) # Match the special postcode "GIR 0AA"
    |
    (?:
      [BEGLMNSW] | # There are 8 single-letter postcode areas
      [A-Z]{2}     # All other postcode areas have two letters
      )
    [0-9] # There is always at least one number after the postcode area
    (?:
      [0-9] # And an optional extra number
      |
      # Only certain postcode areas can have an extra letter after the number
      (?<=N1|E1|SE1|SW1|W1|NW1|EC[0-9]|WC[0-9])
      [A-HJ-NP-Z] # Possible letters here may change, but [IO] will never be used
      )?
    )
  \s*
  ([0-9][ABD-HJLNP-UW-Z]{2}) # The last two letters cannot be [CIKMOV]
$/x

For me this is the right balance between validating as much as possible, while at the same time future-proofing and allowing for easy maintenance.

andre
  • 1,861
  • 1
  • 16
  • 8
  • Not sure why you got voted down - this works with all the valid postcodes that I've thrown at it and spaces which a lot of the above answers do not handle correctly. Would anyone care to explain why? – Jon Aug 19 '14 at 11:23
  • 1
    @Jon It also matches when other characters are appended to the start or end e.g. ``aSW1A 1AAasfg`` matched for me (I didn't downvote though as it seems it could be fixed easily) – decvalts Aug 07 '15 at 13:00
8

Here's a regex based on the format specified in the documents which are linked to marcj's answer:

/^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}$/

The only difference between that and the specs is that the last 2 characters cannot be in [CIKMOV] according to the specs.

Edit: Here's another version which does test for the trailing character limitations.

/^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-BD-HJLNP-UW-Z]{2}$/
DMK
  • 2,448
  • 1
  • 24
  • 35
Will Tomlins
  • 1,436
  • 16
  • 12
  • There are a lot more complexities to a UK postcode than just accepting `A-Z` - `Q` is never allowed, `V` is only used sparingly, etc. depending on the position of the character. – Zhaph - Ben Duguid Jan 14 '13 at 14:04
  • 3
    That maybe irrelevant if what you want is a syntax check. As many others have remarked, only a lookup in an up-to-date database gets nearly correct, and even then there is the problem of how up-to-date the database is. So, for me, this syntax checker regex is clear, simple and useful. – Rick-777 Sep 05 '14 at 12:45
5

I wanted a simple regex, where it's fine to allow too much, but not to deny a valid postcode. I went with this (the input is a stripped/trimmed string):

/^([a-z0-9]\s*){5,8}$/i

This allows the shortest possible postcodes like "L1 8JQ" as well as the longest ones like "OL14 5ET".

Because it allows up to 8 characters, it will also allow incorrect 8 character postcodes if there is no space: "OL145ETX". But again, this is a simplistic regex, for when that's good enough.
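A quick Python sketch of that behaviour, purely as an illustration (the name `LOOSE` is invented here). The quantifier counts letters and digits only, since any whitespace after each one is absorbed by `\s*`:

```python
import re

# Same pattern as above; re.IGNORECASE plays the role of the /i flag.
LOOSE = re.compile(r'^([a-z0-9]\s*){5,8}$', re.IGNORECASE)

for text in ['L1 8JQ', 'OL14 5ET', 'OL145ETX', 'L1 8J', 'OL14 5ETXY']:
    print(repr(text), bool(LOOSE.search(text)))
```

The first three are accepted (5, 7 and 8 alphanumerics) and the last two are rejected (4 and 9).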

Henrik N
  • 15,786
  • 5
  • 82
  • 131
5

Some of the regexes above are a little restrictive. Note that the genuine postcode "W1K 7AA" would fail the rule "Position 3 - AEHMNPRTVXY only used" above, as "K" would be disallowed.

the regex:

^(GIR 0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]|[A-HK-Y][0-9]([0-9]|[ABEHMNPRV-Y]))|[0-9][A-HJKPS-UW])[0-9][ABD-HJLNP-UW-Z]{2})$

Seems a little more accurate, see the Wikipedia article entitled 'Postcodes in the United Kingdom'.

Note that this regex requires uppercase only characters.

The bigger question is whether you are restricting user input to allow only postcodes that actually exist or whether you are simply trying to stop users entering complete rubbish into the form fields. Correctly matching every possible postcode, and future proofing it, is a harder puzzle, and probably not worth it unless you are HMRC.

DMK
  • 2,448
  • 1
  • 24
  • 35
minglis
  • 59
  • 1
  • 1
  • Looks like the post office has moved on, but the government is lagging somewhat behind :( – Zhaph - Ben Duguid Jan 25 '11 at 12:10
  • 4
    I use this one: "^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2})$" I like it because it allows upper and lower cases and makes the space optional - better for usability, if not 100% correct! – bigtv Mar 22 '11 at 19:14
5

Whilst there are many answers here, I'm not happy with any of them. Most of them are either simply broken or too complex.

I looked at @ctwheels' answer and found it very explanatory and correct; we must thank him for that. However, once again it is too much "data" for me, for something so simple.

Fortunately, I managed to get a database with over 1 million active postcodes for England only and made a small PowerShell script to test and benchmark the results.

UK Postcode specifications: Valid Postcode Format.

This is "my" Regex:

^([a-zA-Z]{1,2}[a-zA-Z\d]{1,2})\s(\d[a-zA-Z]{2})$

Short, simple and sweet. Even the most inexperienced can understand what is going on.

Explanation:

^ asserts position at start of a line
    1st Capturing Group ([a-zA-Z]{1,2}[a-zA-Z\d]{1,2})
        Match a single character present in the list below [a-zA-Z]
        {1,2} matches the previous token between 1 and 2 times, as many times as possible, giving back as needed (greedy)
        a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
        A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
        Match a single character present in the list below [a-zA-Z\d]
        {1,2} matches the previous token between 1 and 2 times, as many times as possible, giving back as needed (greedy)
        a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
        A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
        \d matches a digit (equivalent to [0-9])
        \s matches any whitespace character (equivalent to [\r\n\t\f\v ])
    2nd Capturing Group (\d[a-zA-Z]{2})
        \d matches a digit (equivalent to [0-9])
        Match a single character present in the list below [a-zA-Z]
        {2} matches the previous token exactly 2 times
        a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
        A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
$ asserts position at the end of a line
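The two capturing groups also make it easy to split a postcode into its outward and inward parts. A minimal sketch in Python, assuming that is what you want the groups for; `POSTCODE` and `split_postcode` are hypothetical names, not part of the PowerShell script used for the benchmark:

```python
import re

# Same regex as above: group 1 is the outward code, group 2 the inward code.
POSTCODE = re.compile(r'^([a-zA-Z]{1,2}[a-zA-Z\d]{1,2})\s(\d[a-zA-Z]{2})$')


def split_postcode(text):
    """Return (outward, inward) in upper case, or None if the format is wrong."""
    m = POSTCODE.match(text)
    if m is None:
        return None
    return m.group(1).upper(), m.group(2).upper()
```

For example, `split_postcode('WC2H 7LT')` gives `('WC2H', '7LT')`, while `split_postcode('SE50EG')` gives `None` because this regex requires the whitespace character.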

Result (postcodes checked):

TOTAL OK: 1469193
TOTAL FAILED: 0
-------------------------------------------------------------------------
Days              : 0
Hours             : 0
Minutes           : 5
Seconds           : 22
Milliseconds      : 718
Ticks             : 3227185939
TotalDays         : 0.00373516891087963
TotalHours        : 0.0896440538611111
TotalMinutes      : 5.37864323166667
TotalSeconds      : 322.7185939
TotalMilliseconds : 322718.5939
Mecanik
  • 1,539
  • 1
  • 20
  • 50
  • 1
    Thank you @Mecanik - this was just what I needed! I did have to make the whitespace optional for my implementation though: `^([a-zA-Z]{1,2}[a-zA-Z\d]{1,2})\s?(\d[a-zA-Z]{2})$` – Jon Humphrey Mar 29 '23 at 10:23
4

Here's how we have been dealing with the UK postcode issue:

^([A-Za-z]{1,2}[0-9]{1,2}[A-Za-z]?[ ]?)([0-9]{1}[A-Za-z]{2})$

Explanation:

  • expect 1 or 2 a-z chars, upper or lower fine
  • expect 1 or 2 numbers
  • expect 0 or 1 a-z char, upper or lower fine
  • optional space allowed
  • expect 1 number
  • expect 2 a-z, upper or lower fine

This gets most formats; we then use the db to validate whether the postcode is actually real. This data is driven by openpoint https://www.ordnancesurvey.co.uk/opendatadownload/products.html
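The two-stage idea (cheap format check first, lookup second) might look roughly like this in Python. It is a sketch only: `KNOWN_POSTCODES` is a hypothetical in-memory stand-in for the database table, and `is_real_postcode` is an invented name.

```python
import re

# Same regex as above: group 1 is the outward part (with optional space), group 2 the inward part.
FORMAT = re.compile(r'^([A-Za-z]{1,2}[0-9]{1,2}[A-Za-z]?[ ]?)([0-9]{1}[A-Za-z]{2})$')

# Hypothetical stand-in for the postcode table loaded from the Ordnance Survey data.
KNOWN_POSTCODES = {'SE5 0EG', 'WC2H 7LT'}


def is_real_postcode(text, known=KNOWN_POSTCODES):
    """Format check first, then a lookup against the known postcodes."""
    m = FORMAT.match(text.strip())
    if m is None:
        return False  # not even the right shape, so skip the lookup
    # Normalise to upper case with a single space before looking it up.
    normalised = f'{m.group(1).strip().upper()} {m.group(2).upper()}'
    return normalised in known
```

Here `se50eg` passes both stages, `ZZ9 9ZZ` passes the format check but fails the lookup, and `WC2H` is rejected by the regex before any lookup happens.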

Hope this helps.

ʰᵈˑ
  • 11,279
  • 3
  • 26
  • 49
Alex Stephens
  • 3,017
  • 1
  • 36
  • 41
4

Basic rules:

^[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}$

Postal codes in the U.K. (or postcodes, as they’re called) are composed of five to seven alphanumeric characters separated by a space. The rules covering which characters can appear at particular positions are rather complicated and fraught with exceptions. The regular expression just shown therefore sticks to the basic rules.

Complete rules:

If you need a regex that ticks all the boxes for the postcode rules at the expense of readability, here you go:

^(?:(?:[A-PR-UWYZ][0-9]{1,2}|[A-PR-UWYZ][A-HK-Y][0-9]{1,2}|[A-PR-UWYZ][0-9][A-HJKSTUW]|[A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRV-Y]) [0-9][ABD-HJLNP-UW-Z]{2}|GIR 0AA)$

Source: https://www.safaribooksonline.com/library/view/regular-expressions-cookbook/9781449327453/ch04s16.html

Tested against our customers database and seems perfectly accurate.

Raphos
  • 163
  • 7
4

I use the following regex that I have tested against all valid UK postcodes. It is based on the recommended rules, but condensed as much as reasonable and does not make use of any special language specific regex rules.

([A-PR-UWYZ]([A-HK-Y][0-9]([0-9]|[ABEHMNPRV-Y])?|[0-9]([0-9]|[A-HJKPSTUW])?) ?[0-9][ABD-HJLNP-UW-Z]{2})

It assumes that the postcode has been converted to uppercase and has no leading or trailing characters, but will accept an optional space between the outcode and incode.

The special "GIR 0AA" postcode is excluded and will not validate, as it's not in the official Post Office list of postcodes and as far as I'm aware will not be used as a registered address. Adding it should be trivial as a special case if required.

Chisel
  • 41
  • 1
3

First half of postcode Valid formats

  • [A-Z][A-Z][0-9][A-Z]
  • [A-Z][A-Z][0-9][0-9]
  • [A-Z][0-9][0-9]
  • [A-Z][A-Z][0-9]
  • [A-Z][A-Z][A-Z]
  • [A-Z][0-9][A-Z]
  • [A-Z][0-9]

Exceptions
Position 1 - QVX not used
Position 2 - IJZ not used except in GIR 0AA
Position 3 - AEHMNPRTVXY only used
Position 4 - ABEHMNPRVWXY

Second half of postcode

  • [0-9][A-Z][A-Z]

Exceptions
Position 2+3 - CIKMOV not used

Remember that not all possible codes are used, so this list is a necessary but not sufficient condition for a valid code. It might be easier to just match against a list of all valid codes?

Martin Beckett
  • 94,801
  • 28
  • 188
  • 263
3

To check a postcode is in a valid format as per the Royal Mail's programmer's guide:

          |----------------------------outward code------------------------------| |------inward code-----|
#special↓       α1        α2    AAN  AANA      AANN      AN    ANN    ANA (α3)        N         AA
^(GIR 0AA|[A-PR-UWYZ]([A-HK-Y]([0-9][A-Z]?|[1-9][0-9])|[1-9]([0-9]|[A-HJKPSTUW])?) [0-9][ABD-HJLNP-UW-Z]{2})$

All postcodes on doogal.co.uk match, except for those no longer in use.

Adding a ? after the space and using case-insensitive match to answer this question:

'se50eg'.match(/^(GIR 0AA|[A-PR-UWYZ]([A-HK-Y]([0-9][A-Z]?|[1-9][0-9])|[1-9]([0-9]|[A-HJKPSTUW])?) ?[0-9][ABD-HJLNP-UW-Z]{2})$/ig);
Array [ "se50eg" ]
Jackson Pauls
  • 225
  • 2
  • 12
3

This one allows spaces and tabs on both sides, in case you don't want to fail validation and would rather trim the value server side.

^\s*(([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2}))\s*$
Matas Vaitkevicius
  • 58,075
  • 31
  • 238
  • 265
3

Through empirical testing and observation, as well as confirming with https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation, here is my version of a Python regex that correctly parses and validates a UK postcode:

UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'

This regex is simple and has capture groups. It does not include all of the validations of legal UK postcodes, but only takes into account the letter vs number positions.

Here is how I would use it in code:

import re
from dataclasses import dataclass


@dataclass
class UKPostcode:
    postcode_area: str
    district: str
    sector: str  # kept as a string: m.group('sector') returns str
    postcode: str

    # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
    # Original author of this regex: @jontsai
    # NOTE TO FUTURE DEVELOPER:
    # Verified through empirical testing and observation, as well as confirming with the Wiki article
    # If this regex fails to capture all valid UK postcodes, then I apologize, for I am only human.
    UK_POSTCODE_REGEX = r'(?P<postcode_area>[A-Z]{1,2})(?P<district>(?:[0-9]{1,2})|(?:[0-9][A-Z]))(?P<sector>[0-9])(?P<postcode>[A-Z]{2})'

    @classmethod
    def from_postcode(cls, postcode):
        """Parses a string into a UKPostcode

        Returns a UKPostcode or None
        """
        m = re.match(cls.UK_POSTCODE_REGEX, postcode.replace(' ', ''))

        if m:
            uk_postcode = UKPostcode(
                postcode_area=m.group('postcode_area'),
                district=m.group('district'),
                sector=m.group('sector'),
                postcode=m.group('postcode')
            )
        else:
            uk_postcode = None

        return uk_postcode


def parse_uk_postcode(postcode):
    """Wrapper for UKPostcode.from_postcode
    """
    uk_postcode = UKPostcode.from_postcode(postcode)
    return uk_postcode

Here are unit tests:

import pytest


@pytest.mark.parametrize(
    'postcode, expected', [
        # https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
        (
            'EC1A1BB',
            UKPostcode(
                postcode_area='EC',
                district='1A',
                sector='1',
                postcode='BB'
            ),
        ),
        (
            'W1A0AX',
            UKPostcode(
                postcode_area='W',
                district='1A',
                sector='0',
                postcode='AX'
            ),
        ),
        (
            'M11AE',
            UKPostcode(
                postcode_area='M',
                district='1',
                sector='1',
                postcode='AE'
            ),
        ),
        (
            'B338TH',
            UKPostcode(
                postcode_area='B',
                district='33',
                sector='8',
                postcode='TH'
            )
        ),
        (
            'CR26XH',
            UKPostcode(
                postcode_area='CR',
                district='2',
                sector='6',
                postcode='XH'
            )
        ),
        (
            'DN551PT',
            UKPostcode(
                postcode_area='DN',
                district='55',
                sector='1',
                postcode='PT'
            )
        )
    ]
)
def test_parse_uk_postcode(postcode, expected):
    uk_postcode = parse_uk_postcode(postcode)
    assert uk_postcode == expected
jontsai
  • 682
  • 1
  • 6
  • 13
2

To add to this list a more practical regex that I use that allows the user to enter an empty string is:

^$|^(([gG][iI][rR] {0,}0[aA]{2})|((([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y]?[0-9][0-9]?)|(([a-pr-uwyzA-PR-UWYZ][0-9][a-hjkstuwA-HJKSTUW])|([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y][0-9][abehmnprv-yABEHMNPRV-Y]))) {0,1}[0-9][abd-hjlnp-uw-zABD-HJLNP-UW-Z]{2}))$

This regex allows upper- and lower-case letters, with an optional space in between.

From a software developer's point of view, this regex is useful for software where an address may be optional, for example if a user did not want to supply their address details.

JKennedy
  • 18,150
  • 17
  • 114
  • 198
1

I have the regex for UK Postcode validation.

It works for every type of postcode, whether inner or outer:

^((([A-PR-UWYZ][0-9])|([A-PR-UWYZ][0-9][0-9])|([A-PR-UWYZ][A-HK-Y][0-9])|([A-PR-UWYZ][A-HK-Y][0-9][0-9])|([A-PR-UWYZ][0-9][A-HJKSTUW])|([A-PR-UWYZ][A-HK-Y][0-9][ABEHMNPRVWXY]))) || ^((GIR)[ ]?(0AA))$|^(([A-PR-UWYZ][0-9])[ ]?([0-9][ABD-HJLNPQ-UW-Z]{0,2}))$|^(([A-PR-UWYZ][0-9][0-9])[ ]?([0-9][ABD-HJLNPQ-UW-Z]{0,2}))$|^(([A-PR-UWYZ][A-HK-Y0-9][0-9])[ ]?([0-9][ABD-HJLNPQ-UW-Z]{0,2}))$|^(([A-PR-UWYZ][A-HK-Y0-9][0-9][0-9])[ ]?([0-9][ABD-HJLNPQ-UW-Z]{0,2}))$|^(([A-PR-UWYZ][0-9][A-HJKS-UW0-9])[ ]?([0-9][ABD-HJLNPQ-UW-Z]{0,2}))$|^(([A-PR-UWYZ][A-HK-Y0-9][0-9][ABEHMNPRVWXY0-9])[ ]?([0-9][ABD-HJLNPQ-UW-Z]{0,2}))$

It handles all of these formats.

Examples:

AB10 -> outer postcode only

A1 1AA -> combination of outer and inner postcode

WC2A -> outer postcode only

Peter O.
  • 32,158
  • 14
  • 82
  • 96
Vikas Pandey
  • 550
  • 4
  • 13
1

Have a look at the python code on this page:

http://www.brunningonline.net/simon/blog/archives/001292.html

I've got some postcode parsing to do. The requirement is pretty simple; I have to parse a postcode into an outcode and (optional) incode. The good news is that I don't have to perform any validation - I just have to chop up what I've been provided with in a vaguely intelligent manner. I can't assume much about my import in terms of formatting, i.e. case and embedded spaces. But this isn't the bad news; the bad news is that I have to do it all in RPG. :-(

Nevertheless, I threw a little Python function together to clarify my thinking.

I've used it to process postcodes for me.
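
The function from the linked post is not reproduced here. As a minimal sketch of the same kind of splitter, written under my own assumptions: the incode, when present, is always the final digit-letter-letter triple, and nothing is validated beyond that. The name `split_postcode` is hypothetical.

```python
import re

# Incode shape: one digit followed by two letters, at the very end.
_INCODE = re.compile(r"[0-9][A-Z]{2}$")


def split_postcode(raw: str) -> tuple[str, str]:
    """Split a postcode into (outcode, incode) without validating it.

    Whitespace and case are normalised first. If the tail does not look
    like an incode, the whole input is treated as the outcode.
    """
    compact = "".join(raw.split()).upper()
    # The shortest outcode is two characters, so an incode needs 5+ in total.
    if len(compact) >= 5 and _INCODE.search(compact):
        return compact[:-3], compact[-3:]
    return compact, ""
```

For example, `split_postcode("se5 0eg")` gives `("SE5", "0EG")`, and an outcode-only input such as `"AB10"` comes back with an empty incode.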

Community
  • 1
  • 1
Rudiger Wolf
  • 1,760
  • 12
  • 15
0

We were given a spec:

UK postcodes must be in one of the following forms (with one exception, see below): 
    § A9 9AA 
    § A99 9AA
    § AA9 9AA
    § AA99 9AA
    § A9A 9AA
    § AA9A 9AA
where A represents an alphabetic character and 9 represents a numeric character.
Additional rules apply to alphabetic characters, as follows:
    § The character in position 1 may not be Q, V or X
    § The character in position 2 may not be I, J or Z
    § The character in position 3 may not be I, L, M, N, O, P, Q, R, V, X, Y or Z
    § The character in position 4 may not be C, D, F, G, I, J, K, L, O, Q, S, T, U or Z
    § The characters in the rightmost two positions may not be C, I, K, M, O or V
The one exception that does not follow these general rules is the postcode "GIR 0AA", which is a special valid postcode.

We came up with this:

/^([A-PR-UWYZ][A-HK-Y0-9](?:[A-HJKS-UW0-9][ABEHMNPRV-Y0-9]?)?\s*[0-9][ABD-HJLNP-UW-Z]{2}|GIR\s*0AA)$/i

But note - this allows any number of spaces in between groups.
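
For comparison, here is a hedged Python sketch that builds a pattern directly from the spec, one alternative per outward form, so that a letter can only follow a single digit. The constant and function names are hypothetical, and allowing at most one optional space between the groups is my assumption rather than something the spec states.

```python
import re

# Character classes derived from the exclusion rules in the spec above.
POS1 = "[A-PR-UWYZ]"       # position 1: not Q, V, X
POS2 = "[A-HK-Y]"          # position 2: not I, J, Z
POS3 = "[A-HJKS-UW]"       # position 3: not I, L, M, N, O, P, Q, R, V, X, Y, Z
POS4 = "[ABEHMNPRV-Y]"     # position 4: not C, D, F, G, I, J, K, L, O, Q, S, T, U, Z
TAIL = "[ABD-HJLNP-UW-Z]"  # last two letters: not C, I, K, M, O, V

# One alternative per outward form listed in the spec.
OUTWARD = (
    rf"{POS1}[0-9][0-9]?"          # A9, A99
    rf"|{POS1}{POS2}[0-9][0-9]?"   # AA9, AA99
    rf"|{POS1}[0-9]{POS3}"         # A9A
    rf"|{POS1}{POS2}[0-9]{POS4}"   # AA9A
)

# Assumption: at most one space between the outward and inward groups.
SPEC_POSTCODE = re.compile(rf"(?:{OUTWARD}) ?[0-9]{TAIL}{{2}}|GIR ?0AA")


def matches_spec(postcode: str) -> bool:
    """Case-insensitive check of the whole string against the spec forms."""
    return SPEC_POSTCODE.fullmatch(postcode.upper()) is not None
```

Spelling out the forms is longer than the compact regex, but each line maps to one row of the spec, which makes it easier to see why an input like "A99A 9AA" has no matching alternative.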

paulslater19
  • 5,869
  • 1
  • 28
  • 25
  • 2
    paulslater19, unfortunately your solution allows A99A 9AA postcodes. –  Nov 26 '12 at 16:59
0

The accepted answer reflects the rules given by Royal Mail, although there is a typo in the regex. This typo seems to have been in there on the gov.uk site as well (as it is in the XML archive page).

In the format A9A 9AA the rules allow a P character in the third position, whilst the regex disallows this. The correct regex would be:

(GIR 0AA)|((([A-Z-[QVX]][0-9][0-9]?)|(([A-Z-[QVX]][A-Z-[IJZ]][0-9][0-9]?)|(([A-Z-[QVX]][0-9][A-HJKPSTUW])|([A-Z-[QVX]][A-Z-[IJZ]][0-9][ABEHMNPRVWXY])))) [0-9][A-Z-[CIKMOV]]{2}) 

Shortening this results in the following regex (which uses Perl/Ruby syntax):

(GIR 0AA)|([A-PR-UWYZ](([0-9]([0-9A-HJKPSTUW])?)|([A-HK-Y][0-9]([0-9ABEHMNPRVWXY])?))\s?[0-9][ABD-HJLNP-UW-Z]{2})

It also includes an optional space between the first and second block.
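
A small Python check of the difference, as a sketch only: it compiles the shortened pattern twice, once as written and once with P removed from the third-position class, and compares them on an A9A input whose third character is P. "W1P 1HQ" is used here purely as a format example; the helper name `accepts` is hypothetical.

```python
import re

# The shortened pattern from above, with P allowed in the third position.
WITH_P = re.compile(
    r"(GIR 0AA)|([A-PR-UWYZ](([0-9]([0-9A-HJKPSTUW])?)"
    r"|([A-HK-Y][0-9]([0-9ABEHMNPRVWXY])?))"
    r"\s?[0-9][ABD-HJLNP-UW-Z]{2})"
)

# Same pattern with P dropped from the third-position class (the typo).
WITHOUT_P = re.compile(WITH_P.pattern.replace("HJKPSTUW", "HJKSTUW"))


def accepts(pattern: re.Pattern, postcode: str) -> bool:
    """True when the whole string matches the pattern."""
    return pattern.fullmatch(postcode) is not None
```

Here `accepts(WITH_P, "W1P 1HQ")` is true while `accepts(WITHOUT_P, "W1P 1HQ")` is false; other A9A inputs such as "W1A 0AX" pass both.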

Stieb
  • 11
  • 4
0

In nearly all the variations, including the regex from the bulk transfer PDF and the one on the Wikipedia page, I found the same problem: there needs to be a ^ after the first | (vertical bar). I figured this out by testing inputs in the AA9A 9AA format, because otherwise the format check for A9A 9AA will validate them. For example, EC1D 1BB, which should be invalid, comes back valid because C1D 1BB is a valid format.

Here is what I've come up with for a good regex:

^([G][I][R] 0[A]{2})|^((([A-Z-[QVX]][0-9]{1,2})|([A-Z-[QVX]][A-HK-Y][0-9]{1,2})|([A-Z-[QVX]][0-9][ABCDEFGHJKPSTUW])|([A-Z-[QVX]][A-HK-Y][0-9][ABEHMNPRVWXY])) [0-9][A-Z-[CIKMOV]]{2})$
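
The anchoring problem can be reproduced with a much smaller pattern. The Python sketch below uses a deliberately simplified A9A branch (an assumption, not the full letter rules) to show that `^` and `$` each bind only to the alternative next to them unless the alternatives are grouped.

```python
import re

# Simplified stand-in for the A9A 9AA branch (not the full letter rules).
A9A = r"[A-Z][0-9][A-Z] [0-9][A-Z]{2}"

# Anchors bind to the nearest alternative only: '^' applies to GIR 0AA,
# '$' applies to the A9A branch, so that branch may start mid-string.
loose = re.compile(rf"^GIR 0AA|{A9A}$")

# Grouping the alternatives makes both anchors apply to every branch.
strict = re.compile(rf"^(?:GIR 0AA|{A9A})$")


def matches(pattern: re.Pattern, text: str) -> bool:
    """True when the pattern is found anywhere the anchors allow."""
    return pattern.search(text) is not None
```

With these, `matches(loose, "EC1D 1BB")` is true because the match starts at the C, whereas `matches(strict, "EC1D 1BB")` is false.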
0

The method below checks the postcode and returns the validation result together with the formatted postcode.

const isValidUKPostcode = postcode => {
    try {
        // Remove all whitespace so spaced and unspaced input are treated alike.
        postcode = postcode.replace(/\s/g, "");
        // Split into outward and inward parts. match() returns null when the
        // shape is wrong, so format[1] below throws and the catch block runs.
        const format = postcode
            .toUpperCase()
            .match(/^([A-Z]{1,2}\d{1,2}[A-Z]?)\s*(\d[A-Z]{2})$/);
        const finalValue = `${format[1]} ${format[2]}`;
        const regex = /^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z]))))[0-9][A-Za-z]{2})$/i;
        const isValid = regex.test(postcode);
        return {
            isValid,
            formattedPostcode: finalValue,
            error: false,
            message: isValid ? 'It is a valid postcode' : 'Invalid postcode'
        };
    } catch (error) {
        return { error: true, message: 'Invalid postcode' };
    }
};
console.log(isValidUKPostcode('GU348RR'));
// {isValid: true, formattedPostcode: "GU34 8RR", error: false, message: "It is a valid postcode"}
console.log(isValidUKPostcode('sdasd4746asd'));
// {error: true, message: "Invalid postcode"}
console.log(isValidUKPostcode('787898523'));
// {error: true, message: "Invalid postcode"}
Aathi
  • 2,599
  • 2
  • 19
  • 16
-1

I needed a version that would work in SAS with the PRXMATCH and related functions, so I came up with this:

^[A-PR-UWYZ](([A-HK-Y]?\d\d?)|(\d[A-HJKPSTUW])|([A-HK-Y]\d[ABEHMNPRV-Y]))\s?\d[ABD-HJLNP-UW-Z]{2}$

Test cases and notes:

/* 
Notes
The letters QVX are not used in the 1st position.
The letters IJZ are not used in the second position.
The only letters to appear in the third position are ABCDEFGHJKPSTUW when the structure starts with A9A.
The only letters to appear in the fourth position are ABEHMNPRVWXY when the structure starts with AA9A.
The final two letters do not use the letters CIKMOV, so as not to resemble digits or each other when hand-written.
*/

/*
    Bits and pieces
    1st position (any):         [A-PR-UWYZ]         
    2nd position (if letter):   [A-HK-Y]
    3rd position (A1A format):  [A-HJKPSTUW]
    4th position (AA1A format): [ABEHMNPRV-Y]
    Last 2 positions:           [ABD-HJLNP-UW-Z]    
*/


data example;
infile cards truncover;
input valid 1. postcode &$10. Notes &$100.;
flag = prxmatch('/^[A-PR-UWYZ](([A-HK-Y]?\d\d?)|(\d[A-HJKPSTUW])|([A-HK-Y]\d[ABEHMNPRV-Y]))\s?\d[ABD-HJLNP-UW-Z]{2}$/',strip(postcode));
cards;
1  EC1A 1BB  Special case 1
1  W1A 0AX   Special case 2
1  M1 1AE    Standard format
1  B33 8TH   Standard format
1  CR2 6XH   Standard format
1  DN55 1PT  Standard format
0  QN55 1PT  Bad letter in 1st position
0  DI55 1PT  Bad letter in 2nd position
0  W1Z 0AX   Bad letter in 3rd position
0  EC1Z 1BB  Bad letter in 4th position
0  DN55 1CT  Bad letter in 2nd group
0  A11A 1AA  Invalid digits in 1st group
0  AA11A 1AA  1st group too long
0  AA11 1AAA  2nd group too long
0  AAA 1AA   No digit in 1st group
0  AA 1AA    No digit in 1st group
0  A 1AA     No digit in 1st group
0  1A 1AA    Missing letter in 1st group
0  1 1AA     Missing letter in 1st group
0  11 1AA    Missing letter in 1st group
0  AA1 1A    Missing letter in 2nd group
0  AA1 1     Missing letter in 2nd group
;
run;
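
For readers without SAS, a rough Python equivalent of the check above might look like the following. It reuses the same pattern and a subset of the test cases; the names `PATTERN`, `CASES`, `flag` and `mismatches` are mine, and `str.strip()` stands in for the SAS `strip()` call.

```python
import re

# Same pattern as the PRXMATCH call above.
PATTERN = re.compile(
    r"^[A-PR-UWYZ](([A-HK-Y]?\d\d?)|(\d[A-HJKPSTUW])|([A-HK-Y]\d[ABEHMNPRV-Y]))"
    r"\s?\d[ABD-HJLNP-UW-Z]{2}$"
)

# (expected result, postcode) pairs taken from the cards block above.
CASES = [
    (True, "EC1A 1BB"),
    (True, "W1A 0AX"),
    (True, "M1 1AE"),
    (True, "B33 8TH"),
    (True, "CR2 6XH"),
    (True, "DN55 1PT"),
    (False, "QN55 1PT"),  # bad letter in 1st position
    (False, "DI55 1PT"),  # bad letter in 2nd position
    (False, "W1Z 0AX"),   # bad letter in 3rd position
    (False, "EC1Z 1BB"),  # bad letter in 4th position
    (False, "DN55 1CT"),  # bad letter in 2nd group
    (False, "A11A 1AA"),  # invalid digits in 1st group
    (False, "AA1 1A"),    # missing letter in 2nd group
]


def flag(postcode: str) -> bool:
    """True when the stripped postcode matches the pattern."""
    return PATTERN.search(postcode.strip()) is not None


# Collect any case where the pattern disagrees with the expected flag.
mismatches = [(expected, pc) for expected, pc in CASES if flag(pc) != expected]
```

An empty `mismatches` list means the pattern agrees with every expected flag in the subset.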
user667489
  • 9,501
  • 2
  • 24
  • 35
-1

I stole this from an XML document, and it seems to cover all cases without hard-coding GIR 0AA:

%r{[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}}i

(Ruby syntax with ignore case)

Ghoti
  • 2,388
  • 1
  • 18
  • 22
-1

I wrote a regex for UK postcode validation today. As far as I know, it works for all UK postcodes, whether or not you include a space.

^((([a-zA-Z][0-9])|([a-zA-Z][0-9]{2})|([a-zA-Z]{2}[0-9])|([a-zA-Z]{2}[0-9]{2})|([A-Za-z][0-9][a-zA-Z])|([a-zA-Z]{2}[0-9][a-zA-Z]))(\s*[0-9][a-zA-Z]{2})$)

Let me know if there's a format it doesn't cover.

m4n0
  • 29,823
  • 27
  • 76
  • 89
  • 1
    @Mecanik... If you're going to be critical then it is a requirement that you detail your issues. That would allow everyone to learn from your input. – Monty Jan 15 '22 at 12:17