2

I know that regex is not the ideal tool for postal addresses, but I really need one here.

Here is my problem: I have several kinds of postal address:

  • 32 Rue Jean Jaures 69000 Lyon
  • Bâtiment 1 32 Rue Jean Jaures 69000 Lyon
  • 32 B Jean Jaures 69000 Lyon
  • Bâtiment 1 32 B Rue Jean Jaures 69000 Lyon

I need a regexp that finds only the street number, wherever it appears in the string.

I have written a regexp that finds the number when it is at the beginning of the string:

`^([1-9][0-9]{0,2}(?:\s*[A-Z])?)\b`

You can see the result here: https://regex101.com/r/dY7cE6/3

The problem is that it does not find the number when it is not the first number in the string (as in this address: Bâtiment 1 32 Rue Jean Jaures 69000 Lyon).
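The limitation can be reproduced outside regex101 with a small Python sketch. The pattern is the one given above; the helper name `number_at_start` is only an illustration and Python's `re` module is assumed to behave like the regex101 flavour for this pattern.

```python
import re

# The pattern from the question: ^ anchors it to the start of the string.
ANCHORED = re.compile(r"^([1-9][0-9]{0,2}(?:\s*[A-Z])?)\b")

def number_at_start(address):
    """Return the leading street number, or None if the address starts with something else."""
    m = ANCHORED.search(address)
    return m.group(1) if m else None

print(number_at_start("32 Rue Jean Jaures 69000 Lyon"))             # 32
print(number_at_start("32 B Jean Jaures 69000 Lyon"))               # 32 B
print(number_at_start("Bâtiment 1 32 Rue Jean Jaures 69000 Lyon"))  # None
```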

So I am asking for help finding the street number, which here is "32", in every situation.

I will keep searching on my own, but any help would be appreciated.

Thank you.

Niranjan N Raju
VERYNET
  • Can you explain in words the pattern you want to define? e.g. (I'm guessing from your examples) would "always the last number in the string less than 5 digits long" do? – lessthanideal Oct 06 '15 at 12:24
  • This is exactly what I want: "always the last number in the string, unless it is the 5-digit postal code". – VERYNET Oct 06 '15 at 12:26
  • This will **not** end well if you're dealing with multiple countries - especially Ireland. If you absolutely, positively have to reformat addresses you'd be better off using something like PostcodeAnywhere or Experian EDQ. – CD001 Oct 06 '15 at 12:43

3 Answers

1

The following regex will find the second to last number in the string (in the capture group). If there is a single letter after this number (separated by a space or not) it will also capture this.

It requires the last number in the string to be a five digit postcode:

`/(\d+(?:\s*\w\b)?)[^\d]+\d{5}[^\d]+$/`

However, how reliably do you need to identify the house number? What is the possible range of input data? No regex approach is likely to be very good. This question and its answers give some idea of the problems.

See it working on the sample data.
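As a rough check, the pattern can be run over the four sample addresses from the question in Python. This is only a sketch: the helper name `house_number` is made up for illustration, and the last call shows a case where the optional `\w` also accepts a digit, so a one-digit street number after a one-digit building number is captured together with it.

```python
import re

# Pattern from this answer, without the / delimiters.
PATTERN = re.compile(r"(\d+(?:\s*\w\b)?)[^\d]+\d{5}[^\d]+$")

def house_number(address):
    """Return the captured group (number plus optional letter), or None."""
    m = PATTERN.search(address)
    return m.group(1) if m else None

samples = [
    "32 Rue Jean Jaures 69000 Lyon",               # 32
    "Bâtiment 1 32 Rue Jean Jaures 69000 Lyon",    # 32
    "32 B Jean Jaures 69000 Lyon",                 # 32 B
    "Bâtiment 1 32 B Rue Jean Jaures 69000 Lyon",  # 32 B
]
for address in samples:
    print(house_number(address))

# Limitation: \w matches digits as well as letters.
print(house_number("Bâtiment 1 5 Rue Jean Jaures 69000 Lyon"))  # 1 5
```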

Community
  • The problem is that it doesn't work when the number is in first position in the string, as you can see in this example: https://regex101.com/r/dY7cE6/4. In the first case you don't match the "32". All the possible kinds of address data are in my question. The number is pretty important for identifying someone, and I know that regex on addresses is not simple or really optimal, but I haven't found another way to do it. – VERYNET Oct 06 '15 at 12:31
  • You're right, sorry. One last problem: you don't catch the "B" if there is one in the address, for example "32B". Sorry, I should have been more specific about it, but I thought my regex was explicit. Maybe you can just modify my regex, which handles the "B", to help me. – VERYNET Oct 06 '15 at 12:37
  • One last little question, look at this case: https://regex101.com/r/dY7cE6/6. It seems your regex works only if the street number has at least 2 digits, not 1. – VERYNET Oct 06 '15 at 12:56
1

Last number in the string, unless it's 5 digits, optionally capturing a single letter after the number:

`^.*\b(?!\d{5}[A-Z]?\b)(\d+(?:\s*[A-Z]\b)?)`
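A minimal Python sketch of how this pattern behaves on the addresses from the question. The function name `last_street_number` is hypothetical; the greedy `.*` is what pushes the match to the last qualifying number, and the negative lookahead is what skips the 5-digit postal code.

```python
import re

# Greedy .* backtracks from the end of the string, so the capture group
# ends up on the LAST number that is not a 5-digit postal code.
PATTERN = re.compile(r"^.*\b(?!\d{5}[A-Z]?\b)(\d+(?:\s*[A-Z]\b)?)")

def last_street_number(address):
    """Return the last non-postcode number (with optional letter), or None."""
    m = PATTERN.match(address)
    return m.group(1) if m else None

print(last_street_number("32 Rue Jean Jaures 69000 Lyon"))               # 32
print(last_street_number("Bâtiment 1 32 Rue Jean Jaures 69000 Lyon"))    # 32
print(last_street_number("32 B Jean Jaures 69000 Lyon"))                 # 32 B
print(last_street_number("Bâtiment 1 32 B Rue Jean Jaures 69000 Lyon"))  # 32 B
print(last_street_number("32B Rue Jean Jaures 69000 Lyon"))              # 32B
print(last_street_number("69000 Lyon"))                                  # None
```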
Mariano
  • Your regex is OK, but I also need to catch the "B" if there is one in the address, for example "32B". Sorry, I should have been more specific about it, but I thought my regex was explicit. As you can see in the regex in my question, I catch the "B" in every case. – VERYNET Oct 06 '15 at 12:39
  • It was pretty obvious in the regex. :-) Edited – Mariano Oct 06 '15 at 12:44
1
`\b(?!\d{5}\b)\d+\b(?:\s*\w\b)?(?=\D*\b\d{5}\b|\D*$)`

Try this. See demo.

https://regex101.com/r/cJ6zQ3/20
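For anyone who prefers to try it locally, here is a hedged Python sketch of the same pattern (the helper name `street_number` is an assumption for illustration). The whole match is the result, since the pattern has no capture group. With the pattern exactly as written, a letter glued to the number ("32B") is not matched, because `\d+\b` needs a word boundary right after the digits.

```python
import re

# Pattern from this answer; the trailing lookahead requires that only
# non-digits separate the match from the postal code or the end of the string.
PATTERN = re.compile(r"\b(?!\d{5}\b)\d+\b(?:\s*\w\b)?(?=\D*\b\d{5}\b|\D*$)")

def street_number(address):
    """Return the whole match, or None when nothing qualifies."""
    m = PATTERN.search(address)
    return m.group(0) if m else None

print(street_number("32 Rue Jean Jaures 69000 Lyon"))               # 32
print(street_number("Bâtiment 1 32 Rue Jean Jaures 69000 Lyon"))    # 32
print(street_number("32 B Jean Jaures 69000 Lyon"))                 # 32 B
print(street_number("Bâtiment 1 32 B Rue Jean Jaures 69000 Lyon"))  # 32 B
print(street_number("32B Rue Jean Jaures 69000 Lyon"))              # None
```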

vks
  • Your regex is really good, but as I said in previous comments, I need to catch the "B" if there is one in the address, for example "32B". Sorry, I should have been more specific about it, but I thought my regex was explicit. As you can see in the regex in my question, I catch the "B" in every case. – VERYNET Oct 06 '15 at 12:39