What is the ultimate postal code and zip regex?

Question

I'm looking for the ultimate postal code and zip code regex. I'm looking for something that will cover most (hopefully all) of the world.

One single regex for all postal codes would be useless for most cases, not to mention requiring a lot of unicode encoding. Much better is to check regex on a country-by-country basis so that you don't validate things like "New York, NY AF23Q" as correct. — Yes - that Jake., Feb 23 '09 at 17:05
http://regexlib.com/Search.aspx?k=decimal&c=3&m=-1&ps=100 for validating a field go here — Dinesh Kumar, Feb 06 '10 at 05:35
You have a problem. You write a regex for it. Now you have two problems. — Robert S., Feb 23 '09 at 17:38

score 335 · Answer 1 · edited Apr 06 '20 at 13:58

The unicode CLDR contains the postal code regex for each country. (158 regex's in total!)

Download core.zip from http://unicode.org/Public/cldr/26.0.1/
unzip core.zip
Take a look at common/supplemental/postalCodeData.xml from the unzipped content (direct content: common/supplemental/postalCodeData.xml)

Google also has a web service with per-country address formatting information, including postal codes, here - http://i18napis.appspot.com/address (I found that link via http://unicode.org/review/pri180/ )

Edit

Here a copy of postalCodeData.xml regex :

"GB", "GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}"
"JE", "JE\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}"
"GG", "GY\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}"
"IM", "IM\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}"
"US", "\d{5}([ \-]\d{4})?"
"CA", "[ABCEGHJKLMNPRSTVXY]\d[ABCEGHJ-NPRSTV-Z][ ]?\d[ABCEGHJ-NPRSTV-Z]\d"
"DE", "\d{5}"
"JP", "\d{3}-\d{4}"
"FR", "\d{2}[ ]?\d{3}"
"AU", "\d{4}"
"IT", "\d{5}"
"CH", "\d{4}"
"AT", "\d{4}"
"ES", "\d{5}"
"NL", "\d{4}[ ]?[A-Z]{2}"
"BE", "\d{4}"
"DK", "\d{4}"
"SE", "\d{3}[ ]?\d{2}"
"NO", "\d{4}"
"BR", "\d{5}[\-]?\d{3}"
"PT", "\d{4}([\-]\d{3})?"
"FI", "\d{5}"
"AX", "22\d{3}"
"KR", "\d{3}[\-]\d{3}"
"CN", "\d{6}"
"TW", "\d{3}(\d{2})?"
"SG", "\d{6}"
"DZ", "\d{5}"
"AD", "AD\d{3}"
"AR", "([A-HJ-NP-Z])?\d{4}([A-Z]{3})?"
"AM", "(37)?\d{4}"
"AZ", "\d{4}"
"BH", "((1[0-2]|[2-9])\d{2})?"
"BD", "\d{4}"
"BB", "(BB\d{5})?"
"BY", "\d{6}"
"BM", "[A-Z]{2}[ ]?[A-Z0-9]{2}"
"BA", "\d{5}"
"IO", "BBND 1ZZ"
"BN", "[A-Z]{2}[ ]?\d{4}"
"BG", "\d{4}"
"KH", "\d{5}"
"CV", "\d{4}"
"CL", "\d{7}"
"CR", "\d{4,5}|\d{3}-\d{4}"
"HR", "\d{5}"
"CY", "\d{4}"
"CZ", "\d{3}[ ]?\d{2}"
"DO", "\d{5}"
"EC", "([A-Z]\d{4}[A-Z]|(?:[A-Z]{2})?\d{6})?"
"EG", "\d{5}"
"EE", "\d{5}"
"FO", "\d{3}"
"GE", "\d{4}"
"GR", "\d{3}[ ]?\d{2}"
"GL", "39\d{2}"
"GT", "\d{5}"
"HT", "\d{4}"
"HN", "(?:\d{5})?"
"HU", "\d{4}"
"IS", "\d{3}"
"IN", "\d{6}"
"ID", "\d{5}"
"IL", "\d{5}"
"JO", "\d{5}"
"KZ", "\d{6}"
"KE", "\d{5}"
"KW", "\d{5}"
"LA", "\d{5}"
"LV", "\d{4}"
"LB", "(\d{4}([ ]?\d{4})?)?"
"LI", "(948[5-9])|(949[0-7])"
"LT", "\d{5}"
"LU", "\d{4}"
"MK", "\d{4}"
"MY", "\d{5}"
"MV", "\d{5}"
"MT", "[A-Z]{3}[ ]?\d{2,4}"
"MU", "(\d{3}[A-Z]{2}\d{3})?"
"MX", "\d{5}"
"MD", "\d{4}"
"MC", "980\d{2}"
"MA", "\d{5}"
"NP", "\d{5}"
"NZ", "\d{4}"
"NI", "((\d{4}-)?\d{3}-\d{3}(-\d{1})?)?"
"NG", "(\d{6})?"
"OM", "(PC )?\d{3}"
"PK", "\d{5}"
"PY", "\d{4}"
"PH", "\d{4}"
"PL", "\d{2}-\d{3}"
"PR", "00[679]\d{2}([ \-]\d{4})?"
"RO", "\d{6}"
"RU", "\d{6}"
"SM", "4789\d"
"SA", "\d{5}"
"SN", "\d{5}"
"SK", "\d{3}[ ]?\d{2}"
"SI", "\d{4}"
"ZA", "\d{4}"
"LK", "\d{5}"
"TJ", "\d{6}"
"TH", "\d{5}"
"TN", "\d{4}"
"TR", "\d{5}"
"TM", "\d{6}"
"UA", "\d{5}"
"UY", "\d{5}"
"UZ", "\d{6}"
"VA", "00120"
"VE", "\d{4}"
"ZM", "\d{5}"
"AS", "96799"
"CC", "6799"
"CK", "\d{4}"
"RS", "\d{6}"
"ME", "8\d{4}"
"CS", "\d{5}"
"YU", "\d{5}"
"CX", "6798"
"ET", "\d{4}"
"FK", "FIQQ 1ZZ"
"NF", "2899"
"FM", "(9694[1-4])([ \-]\d{4})?"
"GF", "9[78]3\d{2}"
"GN", "\d{3}"
"GP", "9[78][01]\d{2}"
"GS", "SIQQ 1ZZ"
"GU", "969[123]\d([ \-]\d{4})?"
"GW", "\d{4}"
"HM", "\d{4}"
"IQ", "\d{5}"
"KG", "\d{6}"
"LR", "\d{4}"
"LS", "\d{3}"
"MG", "\d{3}"
"MH", "969[67]\d([ \-]\d{4})?"
"MN", "\d{6}"
"MP", "9695[012]([ \-]\d{4})?"
"MQ", "9[78]2\d{2}"
"NC", "988\d{2}"
"NE", "\d{4}"
"VI", "008(([0-4]\d)|(5[01]))([ \-]\d{4})?"
"PF", "987\d{2}"
"PG", "\d{3}"
"PM", "9[78]5\d{2}"
"PN", "PCRN 1ZZ"
"PW", "96940"
"RE", "9[78]4\d{2}"
"SH", "(ASCN|STHL) 1ZZ"
"SJ", "\d{4}"
"SO", "\d{5}"
"SZ", "[HLMS]\d{3}"
"TC", "TKCA 1ZZ"
"WF", "986\d{2}"
"XK", "\d{5}"
"YT", "976\d{2}"

Just with a quick scan of the AU postcode-regex... this regex is very simple and will allow lots of false-positives through, so it's not exhaustive. — Taryn East, Aug 27 '14 at 07:49
The latest version of unicode CLDR containing the postal code regex is version 26.0.1. In later versions it has been removed because the data was not maintained and no other reliable sources could be found. — KIKO Software, Feb 18 '16 at 12:15
Same, very basic for french Zip code regex. Use this one "^((0[1-9])|([1-8][0-9])|(9[0-8])|(2A)|(2B))[0-9]{3}$" -> http://www.developpez.net/forums/d518232/webmasters-developpement-web/javascript-ajax-typescript-dart/javascript/regexp-code-postal-cette/ — Vincent D., Jun 13 '16 at 18:39
I'm using https://i18napis.appspot.com/address/data/GB now; are there any problems with this service? — mgol, Jul 11 '16 at 13:34
Small correction to @kiko-software's comment: the latest version containing postal code data is [27.0.3](https://github.com/unicode-cldr/cldr-core/tree/27.0.3/supplemental). — Sietse, Oct 27 '16 at 09:29
I wondered if someone else might have the same problem like me. In order to make these regexes work correctly in my js array I had to modify for example Italy: "IT", "\d{5}" into "IT": /^\d{5}$/, without second"" and with , . This works as expected but it doesn't with stuff like: ([A-HJ-NP-Z])?\d{4}([A-Z]{3})? Question mark at the end or similar things. Someone out there who managed this?Anyhow thanks for them! — Garavani, Apr 09 '17 at 15:47
Nicaragua (NI) appears to be incorrect. Should just be 5 digits. — dngrmice, Oct 01 '21 at 16:55
@Garavani you are correct. We (I) need to match the exact string; hence added `^(REGEX)$` and it works well for me. — Vaishak, Nov 10 '21 at 15:11

score 146 · Accepted Answer · answered Feb 23 '09 at 17:10

146

There is none.

Postal/zip codes around the world don't follow a common pattern. In some countries they are made up by numbers, in others they can be combinations of numbers an letters, some can contain spaces, others dots, the number of characters can vary from two to at least six...

What you could do (theoretically) is create a seperate regex for every country in the world, not recommendable IMO. But you would still be missing on the validation part: Zip code 12345 may exist, but 12346 not, maybe 12344 doesn't exist either. How do you check for that with a regex?

You can't.

answered Feb 23 '09 at 17:10

Treb

19,903
7
54
87

I suspect that a regex could be compiled, but that a task like this be much better suited to a database. The regex would look something like 10000|10001|10002|10003|....... – Kibbee Feb 23 '09 at 17:17
for validating a field go here http://regexlib.com/Search.aspx?k=decimal&c=3&m=-1&ps=100 – Dinesh Kumar Feb 06 '10 at 05:36
You can use a regexp first that matches your country (see http://en.wikipedia.org/wiki/List_of_postal_codes) and do a real check by an external service like http://www.geonames.org/export/ws-overview.html – SimonSimCity Sep 21 '11 at 14:51
3

My two cents: in Brazil it is actualy 8 numbers, 5 followed by a dash and 3 more – Jorge Campos Oct 16 '15 at 02:28
`^\d{5}(?:[-\s]\d{4})?$` – Aamir Afridi Mar 09 '17 at 14:47
It would be useful for many types of software to have a per-country regex. One example: if you are taking GUI input and want to validate in real time if it's a valid zip code, you wait until your text entry field matches the regex, then you validate against a more fine-grained database. – markgalassi Feb 07 '18 at 05:06
dude this is the best answer without a solution I have ever seen – Skykid Felix Apr 20 '20 at 08:23

score 92 · Answer 3 · edited Jul 27 '15 at 14:36

92

use these regx

$ZIPREG=array(
    "US"=>"^\d{5}([\-]?\d{4})?$",
    "UK"=>"^(GIR|[A-Z]\d[A-Z\d]??|[A-Z]{2}\d[A-Z\d]??)[ ]??(\d[A-Z]{2})$",
    "DE"=>"\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})|(?:[6][013-9]\d{3}))\b",
    "CA"=>"^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$",
    "FR"=>"^(F-)?((2[A|B])|[0-9]{2})[0-9]{3}$",
    "IT"=>"^(V-|I-)?[0-9]{5}$",
    "AU"=>"^(0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2})$",
    "NL"=>"^[1-9][0-9]{3}\s?([a-zA-Z]{2})?$",
    "ES"=>"^([1-9]{2}|[0-9][1-9]|[1-9][0-9])[0-9]{3}$",
    "DK"=>"^([D|d][K|k]( |-))?[1-9]{1}[0-9]{3}$",
    "SE"=>"^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$",
    "BE"=>"^[1-9]{1}[0-9]{3}$",
    "IN"=>"^\d{6}$"
);

edited Jul 27 '15 at 14:36

adnan

1,385
2
17
31

answered May 10 '12 at 07:15

neeraj t

4,654
2
27
30

7

One of the better attempts I've seen to actually answer the OP. Get's slower as you ad more but a clean and clear approach. – Rob May 07 '13 at 17:48
5

It does not get slower as you add more as Rob suggests as you would choose one of the regexes from the country code. – Thomaschaaf Feb 26 '14 at 19:30
2

I see you posted this in 2012. Got any more since? – rybo111 May 19 '14 at 09:06
@rybo111 check Chi answer. – Giulio Caccin May 04 '15 at 12:56
Would the US version accept 00000 with your code, though? – ddunn801 Jun 21 '16 at 14:10
@ddunn801 it's a matter of how far you want to go. Validating a US ZIP code as simply "any five digits" is good enough in some cases. On the other hand, for some applications maybe you need to check against a list of all valid ZIP codes, or even check that the ZIP matches other address info. – Jun 12 '17 at 15:57
6

@ddunn801, there's a (whomping big) differencee between validating the pattern and authenticating the postal code. Authenticating the codes is whole orders of magnitude more difficult since (at least in the U.S.) postal codes are added and dropped regularly. In an ideal world, you would perform a quick-check to validate the pattern before submitting to a service (e.g., USPS) to validate the entire mailing address (services like this are paid, you'd hate to waste the value with bad data). Alas, the world is far from ideal. – JBH Oct 26 '17 at 20:50
Just for future reference, the Dutch (NL) regex is wrong. The letter part is made optional, but it's not; it should be required. Correct would be `^[1-9][0-9]{3}\s?([a-zA-Z]{2})$`, so without trailing `?` quantifier. – Sander van Leeuwen Sep 03 '20 at 13:53

Neil McGuigan · Answer 4 · 2015-04-30T07:32:40.813

88

Every postal code system uses only A-Z and/or 0-9 and sometimes space/dash
Not every country uses postal codes (ex. Ireland outside of Dublin), but we'll ignore that here.
The shortest postal code format is Sierra Leone with NN
The longest is American Samoa with NNNNN-NNNNNN
You should allow one space or dash.
Should not begin or end with space or dash

This should cover the above:

(?i)^[a-z0-9][a-z0-9\- ]{0,10}[a-z0-9]$

edited Apr 30 '15 at 07:32

answered Nov 07 '13 at 19:00

Neil McGuigan

46,580
12
123
152

16

This seems to be the only answer that provides a sanity check (which is probably what the OP wanted) rather than a full validation of every possibly combination. Exactly what I wanted thx – Lukos Apr 29 '15 at 10:40
Still it's important to remember that it's a bad practice, if you want to validate better see @Chi answer. – Giulio Caccin May 04 '15 at 12:52
1

@GiulioCaccin H0H0H0 is a valid Canadian Postal Code (which children use to get letters from Canada Post pretending to be Santa Claus), but that doesn't mean it's a valid customer postal code :) – Neil McGuigan Nov 08 '17 at 23:14
Not bad I liked it but for example: 9743 PS is not working (thats groningen, NL) – Philipp Mochine Mar 21 '18 at 11:56
2

FYI, American Samoa is small enough to only has one postcode and it's 96799 – naterkane May 17 '18 at 19:17
7

In my opinion this is the only good answer. It can universally be used as pre-validation in HTML pattern attribute for instance. – Blackbam Nov 09 '18 at 18:55
2

I think this is a good answer for the situation where one just wants to have a sanity check and not validate precisly per country. Just to have a little cleaner data without much effort -- in cases where full safety is needed, a third party plugin/service might be needed as others pointed out. – Yo Ludke Aug 23 '19 at 09:14
@NeilMcGuigan This regular expression is not working in Safari Browser – Sanjay Khatri Oct 09 '19 at 13:48
4

For Javascript, remove the "(?i) as it does not conform to ECMA script. you can use this. ```^[a-z0-9][a-z0-9\- ]{0,10}[a-z0-9]$``` – Emmanuel Aliji Dec 15 '20 at 09:08

score 17 · Answer 5 · edited Jun 19 '12 at 18:29

Trying to cover the whole world with one regular expression is not completely possible, and certainly not feasible or recommended.

Not to toot my own horn, but I've written some pretty thorough regular expressions which you may find helpful.

Canadian postal codes

Basic validation:
^[ABCEGHJ-NPRSTVXY]{1}[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[ ]?[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[0-9]{1}$

Extended validation:
^(A(0[ABCEGHJ-NPR]|1[ABCEGHK-NSV-Y]|2[ABHNV]|5[A]|8[A])|B(0[CEHJ-NPRSTVW]|1[ABCEGHJ-NPRSTV-Y]|2[ABCEGHJNRSTV-Z]|3[ABEGHJ-NPRSTVZ]|4[ABCEGHNPRV]|5[A]|6[L]|9[A])|C(0[AB]|1[ABCEN])|E(1[ABCEGHJNVWX]|2[AEGHJ-NPRSV]|3[ABCELNVYZ]|4[ABCEGHJ-NPRSTV-Z]|5[ABCEGHJ-NPRSTV]|6[ABCEGHJKL]|7[ABCEGHJ-NP]|8[ABCEGJ-NPRST]|9[ABCEGH])|G(0[ACEGHJ-NPRSTV-Z]|1[ABCEGHJ-NPRSTV-Y]|2[ABCEGJ-N]|3[ABCEGHJ-NZ]|4[ARSTVWXZ]|5[ABCHJLMNRTVXYZ]|6[ABCEGHJKLPRSTVWXZ]|7[ABGHJKNPSTXYZ]|8[ABCEGHJ-NPTVWYZ]|9[ABCHNPRTX])|H(0[HM]|1[ABCEGHJ-NPRSTV-Z]|2[ABCEGHJ-NPRSTV-Z]|3[ABCEGHJ-NPRSTV-Z]|4[ABCEGHJ-NPRSTV-Z]|5[AB]|7[ABCEGHJ-NPRSTV-Y]|8[NPRSTYZ]|9[ABCEGHJKPRSWX])|J(0[ABCEGHJ-NPRSTV-Z]|1[ACEGHJ-NRSTXZ]|2[ABCEGHJ-NRSTWXY]|3[ABEGHLMNPRTVXYZ]|4[BGHJ-NPRSTV-Z]|5[ABCJ-MRTV-Z]|6[AEJKNRSTVWYXZ]|7[ABCEGHJ-NPRTV-Z]|8[ABCEGHLMNPRTVXYZ]|9[ABEHJLNTVXYZ])|K(0[ABCEGHJ-M]|1[ABCEGHJ-NPRSTV-Z]|2[ABCEGHJ-MPRSTVW]|4[ABCKMPR]|6[AHJKTV]|7[ACGHK-NPRSV]|8[ABHNPRV]|9[AHJKLV])|L(0[[ABCEGHJ-NPRS]]|1[ABCEGHJ-NPRSTV-Z]|2[AEGHJMNPRSTVW]|3[BCKMPRSTVXYZ]|4[ABCEGHJ-NPRSTV-Z]|5[ABCEGHJ-NPRSTVW]|6[ABCEGHJ-MPRSTV-Z]|7[ABCEGJ-NPRST]|8[EGHJ-NPRSTVW]|9[ABCGHK-NPRSTVWYZ])|M(1[BCEGHJ-NPRSTVWX]|2[HJ-NPR]|3[ABCHJ-N]|4[ABCEGHJ-NPRSTV-Y]|5[ABCEGHJ-NPRSTVWX]|6[ABCEGHJ-NPRS]|7[AY]|8[V-Z]|9[ABCLMNPRVW])|N(0[ABCEGHJ-NPR]|1[ACEGHKLMPRST]|2[ABCEGHJ-NPRTVZ]|3[ABCEHLPRSTVWY]|4[BGKLNSTVWXZ]|5[ACHLPRV-Z]|6[ABCEGHJ-NP]|7[AGLMSTVWX]|8[AHMNPRSTV-Y]|9[ABCEGHJKVY])|P(0[ABCEGHJ-NPRSTV-Y]|1[ABCHLP]|2[ABN]|3[ABCEGLNPY]|4[NPR]|5[AEN]|6[ABC]|7[ABCEGJKL]|8[NT]|9[AN])|R(0[ABCEGHJ-M]|1[ABN]|2[CEGHJ-NPRV-Y]|3[ABCEGHJ-NPRSTV-Y]|4[AHJKL]|5[AGH]|6[MW]|7[ABCN]|8[AN]|9[A])|S(0[ACEGHJ-NP]|2[V]|3[N]|4[AHLNPRSTV-Z]|6[HJKVWX]|7[HJ-NPRSTVW]|9[AHVX])|T(0[ABCEGHJ-MPV]|1[ABCGHJ-MPRSV-Y]|2[ABCEGHJ-NPRSTV-Z]|3[ABCEGHJ-NPRZ]|4[ABCEGHJLNPRSTVX]|5[ABCEGHJ-NPRSTV-Z]|6[ABCEGHJ-NPRSTVWX]|7[AENPSVXYZ]|8[ABCEGHLNRSVWX]|9[ACEGHJKMNSVWX])|V(0[ABCEGHJ-NPRSTVWX]|1[ABCEGHJ-NPRSTV-Z]|2[ABCEGHJ-NPRSTV-Z]|3[ABCEGHJ-NRSTV-Y]|4[ABCEGK-NPRSTVWXZ]|5[ABCEGHJ-NPRSTV-Z]|6[ABCEGHJ-NPRSTV-Z]|7[ABCEGHJ-NPRSTV-Y]|8[ABCGJ-NPRSTV-Z]|9[ABCEGHJ-NPRSTV-Z])|X(0[ABCGX]|1[A])|Y(0[AB]|1[A]))[ ]?[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[0-9]{1}$

US ZIP codes
```
^[0-9]{5}(-[0-9]{4})?$
```

UK post codes

^([A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]|[A-HK-Y][0-9]([0-9]|[ABEHMNPRV-Y]))|[0-9][A-HJKS-UW])\ [0-9][ABD-HJLNP-UW-Z]{2}|(GIR\ 0AA)|(SAN\ TA1)|(BFPO\ (C\/O\ )?[0-9]{1,4})|((ASCN|BBND|[BFS]IQQ|PCRN|STHL|TDCU|TKCA)\ 1ZZ))$

It is not possible to guarantee accuracy without actually mailing something to an address and having the person let you know when they receive it, but we can narrow things by down by eliminating cases that we know are bad.

The extended version for Canadian Postal Codes might have something wrong or missing, as it says that the following postal code is invalid: E3G 0A1, although it is a valid one. — fsschmitt, Nov 25 '15 at 22:51
I have validated against all 845,495 postal codes in Canada and this regex string has some fixes on the Extended validation to support all of these postal codes. Here is the new regex string for the extended validation on Canadian Postal Codes: http://pastebin.com/vazqFKy4 — fsschmitt, Nov 25 '15 at 23:29

Gavin Miller · Answer 6 · 2009-02-23T20:25:37.293

8

We use the following:

Canada

([A-Z]{1}[0-9]{1}){3}   //We raise to upper first

America

[0-9]{5}                //-or-
[0-9]{5}-[0-9]{4}       //10 digit zip

Other

Accept as is

edited Feb 23 '09 at 20:25

answered Feb 23 '09 at 17:40

Gavin Miller

43,168
21
122
188

1

I'd suggest adding an optional -[0-9]{4} to the US one. Some people do use their ZIP+4. – David Thornley Feb 23 '09 at 20:01
4

/[0-9]{5}(?:-[0-9]{4})?/ lets you validate both styles from the US at the same time. – Chas. Owens May 11 '09 at 20:51
2

@Chas.Owens adding ^ and $ ensure they can't type anything else before or after, like "12345aaa" ... /^[0-9]{5}(?:-[0-9]{4})?$/ – Tim Franklin Jun 13 '13 at 14:55

score 7 · Answer 7 · answered Jul 25 '15 at 16:45

Please note that this is quite a hard problem, as stated by the accepted answer. I guess it didn't deter the folks at geonames.org though. They have a file a country info file, which doesn't fit whole into this answer - limit is at 30000 chars apparently. There are regexes for about 150 countries.

I extracted the bits relevant to this question here :

AD ^(?:AD)*(\d{3})$
AM ^(\d{6})$
AR ^([A-Z]\d{4}[A-Z]{3})$
AT ^(\d{4})$
AU ^(\d{4})$
AX ^(?:FI)*(\d{5})$
AZ ^(?:AZ)*(\d{4})$
BA ^(\d{5})$
BB ^(?:BB)*(\d{5})$
BD ^(\d{4})$
BE ^(\d{4})$
BG ^(\d{4})$
BH ^(\d{3}\d?)$
BM ^([A-Z]{2}\d{2})$
BN ^([A-Z]{2}\d{4})$
BR ^(\d{8})$
BY ^(\d{6})$
CA ^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ]) ?(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$
CH ^(\d{4})$
CL ^(\d{7})$
CN ^(\d{6})$
CR ^(\d{4})$
CU ^(?:CP)*(\d{5})$
CV ^(\d{4})$
CX ^(\d{4})$
CY ^(\d{4})$
CZ ^(\d{5})$
DE ^(\d{5})$
DK ^(\d{4})$
DO ^(\d{5})$
DZ ^(\d{5})$
EC ^([a-zA-Z]\d{4}[a-zA-Z])$
EE ^(\d{5})$
EG ^(\d{5})$
ES ^(\d{5})$
ET ^(\d{4})$
FI ^(?:FI)*(\d{5})$
FM ^(\d{5})$
FO ^(?:FO)*(\d{3})$
FR ^(\d{5})$
GB ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
GE ^(\d{4})$
GF ^((97|98)3\d{2})$
GG ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
GL ^(\d{4})$
GP ^((97|98)\d{3})$
GR ^(\d{5})$
GT ^(\d{5})$
GU ^(969\d{2})$
GW ^(\d{4})$
HN ^([A-Z]{2}\d{4})$
HR ^(?:HR)*(\d{5})$
HT ^(?:HT)*(\d{4})$
HU ^(\d{4})$
ID ^(\d{5})$
IL ^(\d{5})$
IM ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
IN ^(\d{6})$
IQ ^(\d{5})$
IR ^(\d{10})$
IS ^(\d{3})$
IT ^(\d{5})$
JE ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
JO ^(\d{5})$
JP ^(\d{7})$
KE ^(\d{5})$
KG ^(\d{6})$
KH ^(\d{5})$
KP ^(\d{6})$
KR ^(?:SEOUL)*(\d{6})$
KW ^(\d{5})$
KZ ^(\d{6})$
LA ^(\d{5})$
LB ^(\d{4}(\d{4})?)$
LI ^(\d{4})$
LK ^(\d{5})$
LR ^(\d{4})$
LS ^(\d{3})$
LT ^(?:LT)*(\d{5})$
LU ^(\d{4})$
LV ^(?:LV)*(\d{4})$
MA ^(\d{5})$
MC ^(\d{5})$
MD ^(?:MD)*(\d{4})$
ME ^(\d{5})$
MG ^(\d{3})$
MK ^(\d{4})$
MM ^(\d{5})$
MN ^(\d{6})$
MQ ^(\d{5})$
MT ^([A-Z]{3}\d{2}\d?)$
MV ^(\d{5})$
MX ^(\d{5})$
MY ^(\d{5})$
MZ ^(\d{4})$
NC ^(\d{5})$
NE ^(\d{4})$
NF ^(\d{4})$
NG ^(\d{6})$
NI ^(\d{7})$
NL ^(\d{4}[A-Z]{2})$
NO ^(\d{4})$
NP ^(\d{5})$
NZ ^(\d{4})$
OM ^(\d{3})$
PF ^((97|98)7\d{2})$
PG ^(\d{3})$
PH ^(\d{4})$
PK ^(\d{5})$
PL ^(\d{5})$
PM ^(97500)$
PR ^(\d{9})$
PT ^(\d{7})$
PW ^(96940)$
PY ^(\d{4})$
RE ^((97|98)(4|7|8)\d{2})$
RO ^(\d{6})$
RS ^(\d{6})$
RU ^(\d{6})$
SA ^(\d{5})$
SD ^(\d{5})$
SE ^(?:SE)*(\d{5})$
SG ^(\d{6})$
SH ^(STHL1ZZ)$
SI ^(?:SI)*(\d{4})$
SK ^(\d{5})$
SM ^(4789\d)$
SN ^(\d{5})$
SO ^([A-Z]{2}\d{5})$
SV ^(?:CP)*(\d{4})$
SZ ^([A-Z]\d{3})$
TC ^(TKCA 1ZZ)$
TH ^(\d{5})$
TJ ^(\d{6})$
TM ^(\d{6})$
TN ^(\d{4})$
TR ^(\d{5})$
TW ^(\d{5})$
UA ^(\d{5})$
US ^\d{5}(-\d{4})?$
UY ^(\d{5})$
UZ ^(\d{6})$
VA ^(\d{5})$
VE ^(\d{4})$
VI ^\d{5}(-\d{4})?$
VN ^(\d{6})$
WF ^(986\d{2})$
YT ^(\d{5})$
ZA ^(\d{4})$
ZM ^(\d{5})$
CS ^(\d{5})$

Hopefully I didn't make any mistake, my regex-fu is pretty weak.

I would like to point out that the regex for France and Great Britain do not take into account possible spaces; In France, postal codes can be input with a space between the second and third digits (i.e. 75 001 instead of 75001). British post codes are quite often written with a space (i.e. SW1 1AA instead of SW11AA). — salcoin, Oct 22 '15 at 16:37
@salcoin Thanks for the input, I did not notice that (even though I am French). Looks like Chi's answer is better in this regard. — nha, Oct 23 '15 at 12:10
because str_replace a space with no space is super taxing right? :p — Robert Pounder, Dec 15 '16 at 10:08

Romko · Answer 8 · 2015-10-23T15:01:09.973

If someone is still interested in how to validate zip codes I've found a solution:

Using Google Geocoding API we can check validity of ZIP code having both Country code and a ZIP code itself.

For example I live in Ukraine so I can check like this: https://maps.googleapis.com/maps/api/geocode/json?components=postal_code:80380|country:UA

Or using JS API: https://developers.google.com/maps/documentation/javascript/geocoding#ComponentFiltering

Where 80380 is valid ZIP for Ukraine, actually every (#####) is valid.

Google returns ZERO_RESULTS status if nothing found. Or OK and a result if both are correct.

Hope this will be helpful.

The only issue would be the limit on the number of queries, which, depending on the site/size, could be an issue. — Darryl Hein, Oct 23 '15 at 17:42

score 7 · Answer 9 · edited May 11 '09 at 20:42

7

Depending on your application, you might want to implement regex matching for the countries where most of your visitors originate and no validation for the rest (accept anything).

edited May 11 '09 at 20:42

Dónal

185,044
174
569
824

answered Feb 23 '09 at 17:12

mbillard

38,386
18
74
98

score 5 · Answer 10 · answered May 11 '12 at 22:47

5

.*

Big Jump forgot about line breaks, blanks and control characters.

International postal codes are a kind of halting problem.

answered May 11 '12 at 22:47

user unknown

35,537
11
75
121

score 4 · Answer 11 · answered Mar 07 '17 at 23:04

As others have pointed out, one regex to rule them all is unlikely. However, you can craft regular expressions for as many countries as you need using the address formatting info from the Universal Postal Union -- a little-known UN agency.

For example, here are the address formatting rules, including postal code, for a handful of countries (PDF format):

score 2 · Answer 12 · edited Dec 22 '11 at 21:24

Given that there are so many edge cases for each country (eg. London addresses may use a slightly different format to the rest of the UK) I don't think that there is an ultimate regex other than maybe:

[0-9a-zA-Z]+

Best of going with a fairly broad pattern (well not quite as broad as the above), or treat each country/region with a specific pattern of its own!

UPDATE: However, it may be possible to dynamically construct a regex based upon lots of smaller, region specific rules - not sure about performance though!

Lots of country specific patterns can be found on the RegExLib site.

score 2 · Answer 13 · answered Feb 23 '09 at 19:08

The problem is going to be that you probably have no good means of keeping up with the changing postal code requirements of countries on the other side of the globe and which you share no common languages. Unless you have a large enough budget to track this, you are almost certainly better off giving the responsibility of validating addresses to google or yahoo.

Both companies provide address lookup facuilities through a programmable API.

score 1 · Answer 14 · edited Jul 01 '14 at 18:29

Somebody was asking about list of formatting mailing addresses, and I think this is what he was looking for...

Frank's Compulsive Guide to Postal Addresses: http://www.columbia.edu/~fdc/postal/ Doesn't help much with street-level issues, however.

My work uses a couple of tools to assist with this: - Lexis-Nexis services, including NCOA lookups (you'll get address standardization for "free") - "Melissa Data" http://www.melissadata.com

score 1 · Answer 15 · edited Jun 07 '19 at 12:19

1

This is a very simple RegEx for validating US Zipcode (not ZipCode Plus Four):

(?!([089])\1{4})\d{5}

Seems all five digit numeric are valid zipcodes except 00000, 88888 & 99999.

I have tested this RegEx with http://regexpal.com/

SP

edited Jun 07 '19 at 12:19

Cà phê đen

1,883
2
21
20

answered Nov 13 '12 at 15:38

Som Poddar

1,428
1
15
22

This RegEx does not enforce four digits for the zip+4 portion. E.g. it considers "92122-1" a valid zip code. – Sensei James Mar 24 '20 at 01:03

score 1 · Answer 16 · answered Feb 23 '09 at 17:18

1

Why are you doing this and why do you care? As Tom Ritter pointed out, it doesn't matter whether you even have a ZIP/postal code at all, much less whether it's valid or not, until and unless you are actually going to be sending something to that address. Even if you expect that you will be sending them something someday, that doesn't mean you need a postal code today.

answered Feb 23 '09 at 17:18

Dave Sherohman

45,363
14
64
102

Yeah but if they're going to be entering one, might as well make sure it's correct at that point. However, I agree with one of the other answers that basically says, make it validate for the countries that you think will be the majority of your customers. – cdmckay Feb 23 '09 at 19:12
1

Some credit clearing houses will not accept a bill unless the zip is correct. I would rather validate the zip on input, rather than submit the charge and have it rejected. – SamGoody Jan 24 '12 at 09:50

score 1 · Answer 17 · answered Feb 23 '09 at 17:20

1

As noted elsewhere the variation around the world is huge. And even if something that matches the pattern does not mean it exists.

Then, of course, there are many places where postcodes are not used (e.g. much or Ireland).

answered Feb 23 '09 at 17:20

Richard

106,783
21
203
265

Actually, probably all of Ireland, as I don't think D1, D2, etc. are considered proper post codes as you can't identify an address using just this code and a street number. – Dónal May 11 '09 at 20:44

score 1 · Answer 18 · answered May 11 '09 at 20:35

1

There are reasons beyond shipping for having an accurate postal code. Travel agencies doing tours that cross borders (Eurozone excepted of course) need this information ahead of time to give to the authorities. Often this information is entered by an agent that may or may not be familiar with such things. ANY method that can cut down on mistakes is a Good Idea™

However, writing a regex that would cover all postal codes in the world would be insane.

answered May 11 '09 at 20:35

1

It is only a good idea until the code starts rejecting valid zipcodes either because it is buggy or the zipcodes have changed. Validation is something that must either be right or not there at all. At the very least there should be an override option. – Chas. Owens May 11 '09 at 20:49

score 0 · Answer 19 · edited Jun 07 '19 at 09:57

0

If Zip Code allows characters and digits (alphanumeric), below regex would be used where it matches, 5 or 9 or 10 alphanumeric characters with one hypen (-):

^([0-9A-Za-z]{5}|[0-9A-Za-z]{9}|(([0-9a-zA-Z]{5}-){1}[0-9a-zA-Z]{4}))$

edited Jun 07 '19 at 09:57

Cà phê đen

1,883
2
21
20

answered Jun 07 '19 at 09:05

Vivek Kalekere

21
1
6

user3793935 · Answer 20 · 2022-01-28T07:33:53.637

I know this is an old quesiton, but I stumbled across the same problem. I have invoices from over 100 countries and am trying to get the correkt creditor over the zip (if every other check is failing). So what I did is writing a short Python Script, that creates a pattern from a string:

class RegexPatternBuilder:
    """
    Builds a regex pattern out of a given string(i.e. --> HM452 AX2155 : [A-Z]{2}\d{3}\s{1}[A-Z]{2}\d{4})
    """
    __is_alpha_count = 0
    __is_numeric_count = 0
    __is_whitespace_count = 0
    __pattern = ""

    # Count: wich character of the string we're locking at right now
    __count = 0

    # Countrys like  Andora starts theire ZIP with the country abbreviation :AD500
    # So check at first if the ZIP starts with the abbreviation and if so, add it to the pattern and increase the count.
    def __init__(self, zip_string, country):
        self.__zip_string = zip_string
        self.__country = country
        if self.__zip_string.startswith(country):
            self.__pattern = f'({self.__country})'
            self.__count += len(self.__country)

    def build_regex(self):
        # Last step ;
        # Add the current alpha_numeric pattern with count
        if len(self.__zip_string) == self.__count:
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            if self.__is_numeric_count:
                self.__pattern += f"\d{{{self.__is_numeric_count}}}"
            return f'{self.__pattern}\\b'

        # Case: Whitespace
        # Check if there is a crossing from numeric / alphanumeric to whitespace,
        # if so --> add the alpha_numeric regex to the whole pattern with the
        # count as the number of viable appeaerances.
        # Since there is max 1 whitespace in a ZIP, add the whitespace regex immediately.
        # Every other case is similar to that.
        if self.__zip_string[self.__count].isspace():
            if self.__is_numeric_count:
                self.__pattern += f"\d{{{self.__is_numeric_count}}}"
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            self.__pattern += "\s{1}"
            self.__is_whitespace_count += 1
            self.__is_alpha_count = 0
            self.__is_numeric_count = 0

        # Case: Is Alphanumeric
        if self.__zip_string[self.__count].isalpha():
            if self.__is_numeric_count:
                self.__pattern += f"[0-9]{{{self.__is_numeric_count}}}"
            self.__is_whitespace_count = 0
            self.__is_alpha_count += 1
            self.__is_numeric_count = 0

        # Case: Is Numeric
        if self.__zip_string[self.__count].isnumeric():
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            self.__is_whitespace_count = 0
            self.__is_alpha_count = 0
            self.__is_numeric_count += 1

        # Case: Special Character (i.e. - )
        # No escaping or count for this so far, because it shouldn't be needed for our zip purposes
        if not self.__zip_string[self.__count].isalpha() \
                and not self.__zip_string[self.__count].isnumeric() \
                and not self.__zip_string[self.__count].isspace():
            self.__pattern += f'{self.__zip_string[self.__count]}{{1}}'
        self.__count += 1
        return self.build_regex()

With that I created all the different possible regexes for all zips (by country) we have historically and wrote them back into a db table (i.e. something like this in the end: COUNTRY:RE PATTERN:(\d{5})\b [what ever country this might be ;D])

Maybe it helps someone.

What is the ultimate postal code and zip regex?

20 Answers20

Linked

Related