Best JavaScript regex for an address number

Question

I have a collection of all-uppercase address names and numbers, and I want to extract just the first address number encountered in each address. The following examples show what I would like to extract from each:

80 ROSE COTTAGE -> 80
80A ROSE COTTAGE -> 80A
80 A ROSE COTTAGE -> 80 A
80ROSE COTTAGE -> 80 (accidental no-space)
[ANY OTHER TEXT] 80 ROSE COTTAGE -> 80

I have found some similar questions answered here and elsewhere on the internet, but they always deal with an address as a whole as opposed to specifically just address name and number:

Match each address from the address number to the 'street type'

regex street address match

Regular Expression: Any character that is NOT a letter or number

javascript regular expressions address number

JavaScript regex to validate an address

The last one makes reference to a lookahead, which led me to construct a negative lookahead for any alphanumeric character following a potential single letter (e.g. 80 A) in my JavaScript regex. However, without adding the alternative "digits only" group (\d+), my fourth example above does not return just the number.

(?:\d+\s*[A-Z]?(?![A-Z0-9]))|(?:\d+)
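For reference, here is a minimal sketch of how the two alternatives behave on the five examples. The helper name `firstAddressNumber` and the sample list are illustrative assumptions, and "ANY OTHER TEXT" stands in for the arbitrary leading text in the last example.

```javascript
// The two alternatives from the question: digits plus an optional single
// letter that is not followed by another letter or digit, else digits only.
const twoAlternatives = /(?:\d+\s*[A-Z]?(?![A-Z0-9]))|(?:\d+)/;

// Hypothetical helper (not part of the question): first match, or null.
function firstAddressNumber(address) {
  const match = address.match(twoAlternatives);
  return match ? match[0] : null;
}

const examples = [
  "80 ROSE COTTAGE",
  "80A ROSE COTTAGE",
  "80 A ROSE COTTAGE",
  "80ROSE COTTAGE",                 // accidental no-space
  "ANY OTHER TEXT 80 ROSE COTTAGE", // assumed stand-in for leading text
];

for (const address of examples) {
  console.log(JSON.stringify(address), "->", JSON.stringify(firstAddressNumber(address)));
}
```

In the fourth example the first alternative cannot match at the start of the string, because every way of splitting `80R` leaves a letter or digit immediately after the match, so the engine falls through to the digits-only alternative.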

Is there a way to combine these two groups into a single regex expression? Or is this not possible in JavaScript's regex implementation?

Any help with this would be greatly appreciated.

Does it really have to be that complicated? An address usually has only one number which must be the number you are looking for. If it is followed by a character directly like in `80A` or if it is followed by a character encased in spaces like in `80 A ` then that is what you are looking for. — Ke Vin, Sep 22 '14 at 11:04
Hi, thanks for your reply. The dataset is not perfect: as my last two examples show, sometimes the number is not at the start, or a word follows the number without a separating space. Without the lookahead, I found that 80ROSECOTTAGE would result in 80R when it should just be 80, so I have added the digits-only alternative group. This works, but I am wondering if there is a way to combine the two without having the groups. – Derek, Sep 22 '14 at 11:23

vks · Accepted Answer · 2014-09-22T15:52:07.143

0

(\d+\s*(?:[A-Z](?![A-Z]))?)

You can try this.

See demo.

http://regex101.com/r/kM7rT8/13
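A minimal sketch of using this expression with `RegExp.prototype.exec` follows. The function name `extractAddressNumber` is hypothetical, and the `trim()` call is an added assumption: because `\s*` is greedy, an input such as `80 ROSE COTTAGE` matches `80 ` with a trailing space.

```javascript
// The expression from the answer, with its outer capturing group kept.
const addressNumberPattern = /(\d+\s*(?:[A-Z](?![A-Z]))?)/;

// Hypothetical helper: returns the first address number, or null if the
// input contains no digits. trim() removes whitespace captured by \s*
// when no single letter follows (an assumption about the desired output).
function extractAddressNumber(address) {
  const match = addressNumberPattern.exec(address);
  return match ? match[1].trim() : null;
}

console.log(extractAddressNumber("80 ROSE COTTAGE"));   // "80"
console.log(extractAddressNumber("80A ROSE COTTAGE"));  // "80A"
console.log(extractAddressNumber("80 A ROSE COTTAGE")); // "80 A"
console.log(extractAddressNumber("80ROSE COTTAGE"));    // "80"
console.log(extractAddressNumber("A 80 A"));            // "80 A"
```

Making the whole "letter not followed by a letter" part optional as one group means the digits are always kept intact, so no separate digits-only alternative is needed.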

edited Sep 22 '14 at 15:52

answered Sep 22 '14 at 11:43


Hi, I am attempting to apply your suggestion to the first group in my regular expression (so that I can drop the second). The best I can manage is `\d+\s*[A-Z]?(?![A-Z]{2,})`. This works fine, except that it seems to drop the zero from 80 when the example text is '80ROSE COTTAGE'. Did you mean for it to be applied some other way? What I require is: [At least one digit(s)] followed by [Any or no whitespace] followed by [Any one 'A to Z' that is not followed by another 'A to Z'] (or failing the last part, just the original digit(s)). I hope that makes sense? – Derek Sep 22 '14 at 14:50
@Derek you have to use the .replace function and replace with ``. Just use the regex given and nothing else. – vks Sep 22 '14 at 14:55
Hi, sorry, I didn't realise that was the case. Am I right in thinking that carrying out a replace using this on 'A 80 A' would return 'A 80 A' as opposed to '80 A'? – Derek Sep 22 '14 at 15:22
@Derek it would return `A 80 A` – vks Sep 22 '14 at 15:22
@Derek http://regex101.com/r/kM7rT8/12 – vks Sep 22 '14 at 15:24
Thanks for that, but I don't need a regular expression to replace all multiple 'A to Z' occurrences with "", because as is the case with 'A 80 A' or '80 A N A D D R E S S W I T H T O O M A N Y S P A C E S', this would return false positives. I am trying to specifically pattern match: [At least one digit(s)] followed by [Any or no whitespace] followed by [Any one 'A to Z' that is not followed by another 'A to Z'] (or failing the last part, just the original digit(s)). I have managed to do this with the two groups, but I am wondering if it can be made into a single expression with no groups. – Derek Sep 22 '14 at 15:47
@Derek try now the new regex – vks Sep 22 '14 at 15:52
vks thank you very much, that is exactly what I was after... I hadn't considered grouping the character-after-digits check together as you have done in that last expression. That is very helpful, and very much appreciated :-) – Derek Sep 22 '14 at 16:08
@Derek glad we could do it finally. :) – vks Sep 22 '14 at 16:09
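
The dropped zero mentioned in the comments appears to come from backtracking: when the lookahead `(?![A-Z]{2,})` fails after `80` and after `80R`, the engine gives back a digit and succeeds after `8`, since the next character (`0`) is not a letter. A small sketch comparing that attempt with the final expression (variable names are illustrative assumptions):

```javascript
const sample = "80ROSE COTTAGE";

// Attempt from the comments: the lookahead guards the whole match, so a
// failure forces \d+ to backtrack and give up the trailing digit.
const attempted = /\d+\s*[A-Z]?(?![A-Z]{2,})/;

// Final expression: the lookahead sits inside an optional group, so a
// failure only skips the letter and leaves the digits untouched.
const accepted = /\d+\s*(?:[A-Z](?![A-Z]))?/;

const attemptedResult = sample.match(attempted)[0];
const acceptedResult = sample.match(accepted)[0];

console.log(attemptedResult); // "8"
console.log(acceptedResult);  // "80"
```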
