14

I wrote a regex for PHP's preg_match function that looks like this:

^([a-zA-Z]){4}([a-zA-Z]){2}([0-9a-zA-Z]){2}([0-9a-zA-Z]{3})?$^

I need it to check the format of a BIC string.

Something is wrong with it: it always reports the string as valid, and I have no idea why.

The code I use is something like this:

/**
 * Checks the format of the given BIC.
 * @param string $bic
 * @return boolean
 */
public function checkBic($bic)
{
    $bic = $this->cleanFromSeparators($bic);
    if (preg_match($this->getBicCompare(), $bic)) {
        return true;
    } else {
        return false;
    }
}

private function getBicCompare()
{
    return "^([a-zA-Z]){4}([a-zA-Z]){2}([0-9a-zA-Z]){2}([0-9a-zA-Z]{3})?$^";
}

EDIT:

Here are some references for the BIC format used by SWIFT:

http://www.sage.co.uk/sage1000v2_1/form_help/workingw/subfiles/iban_and_bic.htm

http://en.wikipedia.org/wiki/ISO_9362

http://www.swift.com/products_services/bic_and_iban_format_registration_bic_details?rdct=t

Example BICs would be:

NOLADE21STS

OPSKATWW

The regex should return true only if the string is eight or eleven characters long and consists of:

Bank code - 4 alphabetic characters

Country code - 2 letters

Location code - 2 alphanumeric characters, except zero

Branch code - 3 alphanumeric characters

These are the specifications.

So the length must be either 8 or 11: the first 4 characters can be anything, then 2 letters are mandatory, then 2 digits, and finally 3 optional alphanumeric characters.

The following are not valid:

abcdefxx

abcdefxxyyy

These also are not valid:

aaaa11xx

aaaa11xxyyy

and so on.

MrBoJangles
Sangoku

8 Answers

17

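You are using ^ as the delimiter? That leaves the expression without a start anchor, so any string that merely ends in a valid-looking sequence matches. You probably want something more like: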

'/^[a-z]{6}[0-9a-z]{2}([0-9a-z]{3})?\z/i'
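To illustrate the difference, here is a minimal sketch that runs the question's pattern and the pattern above against the same input. The helper name matchesPattern and the junk-prefixed test string are made up for this example; only preg_match is a real PHP function.

```php
<?php
// The question's pattern: the outer ^ characters are consumed as delimiters,
// so the expression itself is only anchored at the end (by $).
$unanchored = '^([a-zA-Z]){4}([a-zA-Z]){2}([0-9a-zA-Z]){2}([0-9a-zA-Z]{3})?$^';

// The pattern from this answer: / delimiters, ^ at the start, \z at the end.
$anchored = '/^[a-z]{6}[0-9a-z]{2}([0-9a-z]{3})?\z/i';

// Hypothetical helper: true when the pattern matches the value.
function matchesPattern(string $pattern, string $value): bool
{
    return preg_match($pattern, $value) === 1;
}

// A valid BIC with garbage in front of it.
$input = '!!!junkNOLADE21STS';

var_dump(matchesPattern($unanchored, $input)); // bool(true): only the tail is checked
var_dump(matchesPattern($anchored, $input));   // bool(false): the whole string is checked
```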
Qtax
  • "*When using the PCRE functions, it is required that the pattern is enclosed by delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character.*" [Source](http://php.net/manual/en/regexp.reference.delimiters.php). It shouldn't be a problem to use `^` as a delimiter. – Loamhoof Apr 10 '13 at 08:09
  • @Loamhoof, it's a problem because he wants to anchor the expression. That's why I guess `^` is used as first character. – Qtax Apr 10 '13 at 08:10
  • To be honest... that was just a coincidence. I started the regex as ^()^ piggy :) – Sangoku Apr 10 '13 at 08:12
  • @Qtax surely he wants to do that, but without an example of what's matching while it should not, we can't be sure it's the real problem. (I guess `^^` should work, but still be ugly.) – Loamhoof Apr 10 '13 at 08:12
  • @Loamhoof, `^^` wouldn't work, you'd need to escape the anchor in that case, like `^\^`, but I'm unsure if that would get you the literal or the meta meaning of the character in PHP. In that case probably clearer to use `^\A`. – Qtax Apr 10 '13 at 08:19
  • Yeah you're right. Well, anyway we'll have to wait for an OP update. – Loamhoof Apr 10 '13 at 08:23
  • Added valid and invalid examples. – Sangoku Apr 10 '13 at 09:31
  • Almost ok but i made some adjustments to it and now it is perfect :) – Sangoku Apr 10 '13 at 10:54
  • 5
    PHP snippet `$result_bic = (bool) ( preg_match('/^[a-z]{6}[0-9a-z]{2}([0-9a-z]{3})?\z/i', $bic) == 1 );` – Jaro Jul 31 '14 at 08:10
9

Structure

The latest edition is ISO 9362:2009 (dated 2009-10-01). The SWIFT code is 8 or 11 characters, made up of:

4 letters: Institution Code or bank code.

2 letters: ISO 3166-1 alpha-2 country code

2 letters or digits: location code

If the second character is "0", then it is typically a test BIC as opposed to a BIC used on the live network.

If the second character is "1", then it denotes a passive participant in the SWIFT network.

If the second character is "2", then it typically indicates a reverse billing BIC, where the recipient pays for the message as opposed to the more usual mode whereby the sender pays for the message.

3 letters or digits: branch code, optional ('XXX' for primary office)

(http://en.wikipedia.org/wiki/ISO_9362)

(different definition in German-Wiki http://de.wikipedia.org/wiki/ISO_9362)

2 letters or digits: location code. The first character must not be the digit "0" or "1". The letter 'O' is not allowed as the second character. (Regex for this definition: [2-9a-z][0-9a-np-z])

'/^[a-z]{6}[2-9a-z][0-9a-np-z]([a-z0-9]{3}|x{3})?$/i'
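As a rough sketch of how the parts of this pattern map onto the structure described above, the expression can be assembled from named pieces. The variable names and the isBicFormat helper are illustrative only. Two assumptions differ from the one-liner: the redundant x{3} alternative is dropped (it is already covered by [a-z0-9]{3}), and \z is used instead of $ so that a trailing newline is rejected.

```php
<?php
// Each piece corresponds to one part of the BIC structure.
$bankCode     = '[a-z]{4}';             // institution code: 4 letters
$countryCode  = '[a-z]{2}';             // ISO 3166-1 alpha-2 country code: 2 letters
$locationCode = '[2-9a-z][0-9a-np-z]';  // first char not 0 or 1, second char not the letter O
$branchCode   = '(?:[a-z0-9]{3})?';     // optional branch code: 3 letters or digits

// The i modifier makes the letter classes case-insensitive.
$pattern = '/^' . $bankCode . $countryCode . $locationCode . $branchCode . '\z/i';

// Hypothetical helper: true when $bic has the expected format.
function isBicFormat(string $pattern, string $bic): bool
{
    return preg_match($pattern, $bic) === 1;
}

var_dump(isBicFormat($pattern, 'NOLADE21STS')); // bool(true)
var_dump(isBicFormat($pattern, 'OPSKATWW'));    // bool(true)
var_dump(isBicFormat($pattern, 'NOLADE11'));    // bool(false): location code starts with 1
var_dump(isBicFormat($pattern, 'NOLADE2O'));    // bool(false): letter O as second location character
```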
chiborg
jack88
  • 3
    Adding a breakdown of how the separate sections of your regex relate to the specification would be useful. – forivall Sep 27 '13 at 18:36
  • 3
    The German Wikipedia also says that the optional branch name must not start with an "X" except when it's XXX. Then the regex would be `'/^[A-Z]{6}[2-9A-Z][0-9A-NP-Z](XXX|[0-9A-WYZ][0-9A-Z]{2})?$/i'` – chiborg May 04 '18 at 13:02
6

This is the official SEPA pattern for validating BIC

[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]([A-Z0-9]{3,3}){0,1}
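The pattern above has no delimiters or anchors, so it cannot be passed to preg_match as it stands. A minimal sketch of one way to use it from PHP follows, assuming the pattern is meant to cover the whole string; the function name matchesSepaBicPattern is made up for this example.

```php
<?php
// Assumption: the pattern must match the entire string, so it is wrapped
// in ^ ... \z. No i modifier is added, so only upper-case input matches.
function matchesSepaBicPattern(string $bic): bool
{
    $pattern = '/^[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]([A-Z0-9]{3,3}){0,1}\z/';
    return preg_match($pattern, $bic) === 1;
}

var_dump(matchesSepaBicPattern('NOLADE21STS')); // bool(true)
var_dump(matchesSepaBicPattern('nolade21sts')); // bool(false): lower case
var_dump(matchesSepaBicPattern("OPSKATWW\n"));  // bool(false): \z rejects the trailing newline
```

With $ in place of \z, the last call would return true, because $ also matches just before a final newline unless the D modifier is set.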
Timo
2

I think this one would do:

/^[a-z0-9]{4}[a-z]{2}\d{2}([a-z0-9]{3})?$/

That is:

  1. start of string, ^
  2. four alphanumeric chars, [a-z0-9]{4}
  3. two letters, [a-z]{2}
  4. two numbers, \d{2}
  5. three optional (? suffix) alphanumeric chars, ([a-z0-9]{3})?
  6. end of string, $

You can see it in action and test it here (I used your samples). Anyway, from the rules you are reporting, OPSKATWW shouldn't be a valid BIC since it has no numbers after the first 6 letters.

Paolo Stefan
  • I need the z/i to make sure it is case insensitive. the \d{2} part is good. will note it. /d decimal. nice one. – Sangoku Apr 10 '13 at 17:08
  • Sorry, I meant `/^[a-z0-9]{4}[a-z]{2}\d{2}([a-z0-9]{3})?$/i`. What is the `z` for? Anyway, you could also do a `$bic=strtolower($bic)` before testing it against this regex. – Paolo Stefan Apr 11 '13 at 07:55
1

I do not recommend using this because it performs poorly, but in case someone needs to verify BICs against ISO 3166-1 country codes as well:

/^[A-Z]{4}(AC|AD|AE|AF|AG|AI|AL|AM|AN|AO|AQ|AR|AS|AT|AU|AW|AX|AZ|BA|BB|BD|BE|BF|BG|BH|BI|BJ|BL|BM|BN|BO|BQ|BR|BS|BT|BV|BW|BY|BZ|CA|CC|CD|CE|CF|CG|CH|CI|CK|CL|CM|CN|CO|CP|CR|CS|CU|CV|CW|CX|CY|CZ|DD|DE|DG|DJ|DK|DM|DO|DZ|EA|EC|EE|EG|EH|ER|ES|ET|EU|FI|FJ|FK|FM|FO|FR|FX|GA|GB|GD|GE|GF|GG|GH|GI|GL|GM|GN|GP|GQ|GR|GS|GT|GU|GW|GY|HK|HM|HN|HR|HT|HU|IC|ID|IE|IL|IM|IN|IO|IQ|IR|IS|IT|JE|JM|JO|JP|KE|KG|KH|KI|KM|KN|KP|KR|KW|KY|KZ|LA|LB|LC|LI|LK|LR|LS|LT|LU|LV|LY|MA|MC|MD|ME|MF|MG|MH|MK|ML|MM|MN|MO|MP|MQ|MR|MS|MT|MU|MV|MW|MX|MY|MZ|NA|NC|NE|NF|NG|NI|NL|NO|NP|NR|NT|NU|NZ|OM|PA|PE|PF|PG|PH|PK|PL|PM|PN|PR|PS|PT|PW|PY|QA|RE|RO|RS|RU|RW|SA|SB|SC|SD|SE|SF|SG|SH|SI|SJ|SK|SL|SM|SN|SO|SR|SS|ST|SU|SV|SX|SY|SZ|TA|TC|TD|TF|TG|TH|TJ|TK|TL|TM|TN|TO|TP|TR|TT|TV|TW|TZ|UA|UG|UK|UM|US|UY|UZ|VA|VC|VE|VG|VI|VN|VU|WF|WS|XK|YE|YT|ZA|ZM|ZR|ZW)[2-9A-Z][0-9A-NP-Z](XXX|[0-9A-WYZ][0-9A-Z]{2})?$/i

Also this is content validation and not just format validation. But it will save you a bit of code :)
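An alternative that keeps the regex short is to capture the country code and look it up in an array. The sketch below is only an illustration: the function name hasKnownCountry is invented, \z replaces $ as an assumption about the desired strictness, and the three-entry country list is a deliberately tiny placeholder that would have to be replaced by the full list.

```php
<?php
// Hypothetical helper: checks the format first, then the country code.
function hasKnownCountry(string $bic): bool
{
    // Placeholder subset for the example, not the complete ISO 3166-1 list.
    $knownCountryCodes = ['AT', 'DE', 'GB'];

    // Same structure as the long pattern, but the country part is captured
    // as a generic two-letter group instead of a long alternation.
    $pattern = '/^[A-Z]{4}([A-Z]{2})[2-9A-Z][0-9A-NP-Z](?:XXX|[0-9A-WYZ][0-9A-Z]{2})?\z/i';

    if (preg_match($pattern, $bic, $matches) !== 1) {
        return false;
    }

    // $matches[1] holds the captured country code; compare in upper case.
    return in_array(strtoupper($matches[1]), $knownCountryCodes, true);
}

var_dump(hasKnownCountry('NOLADE21STS')); // bool(true): DE is in the list
var_dump(hasKnownCountry('OPSKATWW'));    // bool(true): AT is in the list
var_dump(hasKnownCountry('NOLAZZ21STS')); // bool(false): ZZ is not in the list
```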

Adriano
serraine
1
echo preg_match('/^[A-Z]{4}(AC|AD|AE|AF|AG|AI|AL|AM|AN|AO|AQ|AR|AS|AT|AU|AW|AX|AZ|BA|BB|BD|BE|BF|BG|BH|BI|BJ|BL|BM|BN|BO|BQ|BR|BS|BT|BV|BW|BY|BZ|CA|CC|CD|CE|CF|CG|CH|CI|CK|CL|CM|CN|CO|CP|CR|CS|CU|CV|CW|CX|CY|CZ|DD|DE|DG|DJ|DK|DM|DO|DZ|EA|EC|EE|EG|EH|ER|ES|ET|EU|FI|FJ|FK|FM|FO|FR|FX|GA|GB|GD|GE|GF|GG|GH|GI|GL|GM|GN|GP|GQ|GR|GS|GT|GU|GW|GY|HK|HM|HN|HR|HT|HU|IC|ID|IE|IL|IM|IN|IO|IQ|IR|IS|IT|JE|JM|JO|JP|KE|KG|KH|KI|KM|KN|KP|KR|KW|KY|KZ|LA|LB|LC|LI|LK|LR|LS|LT|LU|LV|LY|MA|MC|MD|ME|MF|MG|MH|MK|ML|MM|MN|MO|MP|MQ|MR|MS|MT|MU|MV|MW|MX|MY|MZ|NA|NC|NE|NF|NG|NI|NL|NO|NP|NR|NT|NU|NZ|OM|PA|PE|PF|PG|PH|PK|PL|PM|PN|PR|PS|PT|PW|PY|QA|RE|RO|RS|RU|RW|SA|SB|SC|SD|SE|SF|SG|SH|SI|SJ|SK|SL|SM|SN|SO|SR|SS|ST|SU|SV|SX|SY|SZ|TA|TC|TD|TF|TG|TH|TJ|TK|TL|TM|TN|TO|TP|TR|TT|TV|TW|TZ|UA|UG|UK|UM|US|UY|UZ|VA|VC|VE|VG|VI|VN|VU|WF|WS|XK|YE|YT|ZA|ZM|ZR|ZW)[2-9A-Z][0-9A-NP-Z]([A-Z0-9]{3}|x{3})?$/',$val);
1

The most recent ISO 9362:2022(E) allows the following strings as BICs:

^[A-Z0-9]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?$

Note: Only upper-case letters are allowed; so don’t ignoreCase.
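A minimal PHP sketch of this stricter check, assuming the pattern above is used unchanged apart from PHP delimiters and \z in place of $; the function name isBic2022Format is made up for this example.

```php
<?php
// No i modifier: lower-case input is rejected, as the note above requires.
function isBic2022Format(string $bic): bool
{
    return preg_match('/^[A-Z0-9]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?\z/', $bic) === 1;
}

var_dump(isBic2022Format('NOLADE21STS')); // bool(true)
var_dump(isBic2022Format('1234DE21'));    // bool(true): digits allowed in the first four characters
var_dump(isBic2022Format('nolade21sts')); // bool(false): lower case
```

Whether user input should be upper-cased with strtoupper before this check is a design decision for the calling code and is not covered by the pattern itself.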


If one wishes, one could check against the list of country codes:

^[A-Z0-9]{4}(?:AD|AE|AF|AG|AI|AL|AM|AO|AQ|AR|AS|AT|AU|AW|AX|AZ|BA|BB|BD|BE|BF|BG|BH|BI|BJ|BL|BM|BN|BO|BQ|BR|BS|BT|BV|BW|BY|BZ|CA|CC|CD|CF|CG|CH|CI|CK|CL|CM|CN|CO|CR|CU|CV|CW|CX|CY|CZ|DE|DJ|DK|DM|DO|DZ|EC|EE|EG|EH|ER|ES|ET|FI|FJ|FK|FM|FO|FR|GA|GB|GD|GE|GF|GG|GH|GI|GL|GM|GN|GP|GQ|GR|GS|GT|GU|GW|GY|HK|HM|HN|HR|HT|HU|ID|IE|IL|IM|IN|IO|IQ|IR|IS|IT|JE|JM|JO|JP|KE|KG|KH|KI|KM|KN|KP|KR|KW|KY|KZ|LA|LB|LC|LI|LK|LR|LS|LT|LU|LV|LY|MA|MC|MD|ME|MF|MG|MH|MK|ML|MM|MN|MO|MP|MQ|MR|MS|MT|MU|MV|MW|MX|MY|MZ|NA|NC|NE|NF|NG|NI|NL|NO|NP|NR|NU|NZ|OM|PA|PE|PF|PG|PH|PK|PL|PM|PN|PR|PS|PT|PW|PY|QA|RE|RO|RS|RU|RW|SA|SB|SC|SD|SE|SG|SH|SI|SJ|SK|SL|SM|SN|SO|SR|SS|ST|SV|SX|SY|SZ|TC|TD|TF|TG|TH|TJ|TK|TL|TM|TN|TO|TR|TT|TV|TW|TZ|UA|UG|UM|US|UY|UZ|VA|VC|VE|VG|VI|VN|VU|WF|WS|YE|YT|ZA|ZM|ZW)[A-Z0-9]{2}(?:[A-Z0-9]{3})?$

Please keep the performance in mind as already mentioned by serraine.

#fyi One can find the list of country codes here: https://www.iso.org/obp/ui.

Mue
-2

OK. For everyone who has the same problem, the correct regex is:

/^[0-9a-z]{4}[a-z]{2}[0-9a-z]{2}([0-9a-z]{3})?\z/i

Thanks @Qtax for providing it. I just refined it a bit.

Edit: the tweak was that I changed it so the first 4 characters can be alphanumeric, while the 2 letters after them have to be an international country code, which is why they are letters only. I checked it with actual users who have codes in real use; they can have numeric values in the first 4 positions.

Edit:

I was wrong. The first 4 characters can only be letters. I was relying on a statement from an employee of Raiffeisen Bank with whom I was discussing the standard. It turned out he thought that the bank's number from some internal system of theirs could be a valid code. As it turned out, that is not the case.

So the correct syntax is

/^[a-z]{6}[0-9a-z]{2}([0-9a-z]{3})?\z/i

Will mark the correct answer. Thank you for pointing it out.

Sangoku
  • 1
    Why did you add digits to the first 4 characters? That's not in your question and not in the linked spec either. – Qtax Apr 10 '13 at 13:23
  • That is the code for the country and a number. Trust me, this one is OK :) – Sangoku Jun 18 '13 at 06:29
  • To everyone who keeps downvoting: the first 4 characters can contain numbers for some countries, but there HAS to be a 2-digit number at the end or the bank's check will fail. I make a direct call to the bank API after this check. – Sangoku Oct 01 '13 at 12:15
  • The last regex from Sangoku is *not* correct. The first four characters of a BIC need to be letters, NOT numbers. I think the correct one would be: /^[a-z]{4}[a-z]{2}[0-9a-z]{2}([0-9a-z]{3})?\z/i Or, to make it shorter: /^[a-z]{6}[0-9a-z]{2}([0-9a-z]{3})?\z/i – Armin Hierstetter Oct 07 '13 at 10:56
  • Not anymore. The first four characters can be digits. See https://stackoverflow.com/a/73048163/1249581. – VisioN Aug 18 '23 at 09:43