I'm looking to extract addresses, where each match runs from one street number up to the next street number, for every street number in the text. For example, to use in a Python script like this snippet:

import re

pattern = r"(\d+[^0-9]+)(?!\d)"  # specifying the search pattern
streetSuburbJob = re.findall(pattern, text)  # Return a list of strings that match the pattern

Using regex101 I'm getting close with my pattern, but there is one match I'm missing, and it is going to catch me out when I apply this to the full data set.

See regex demo

The use of a slash to describe a flat number, as in 3/40A, complicates things. How do I modify my regex to allow for the situation where the street address number contains this character?

Dave
  • Good question and thanks for showing your efforts. Please also add a sample of the input and the expected output as text in your question to make it clearer, thank you. – RavinderSingh13 May 12 '23 at 06:00

1 Answer

You may use this regex:

\d+(?:/\d+)?\D+

RegEx Demo

RegEx Details:

  • \d+: Match one or more digits
  • (?:/\d+)?: Match / followed by one or more digits; the whole non-capturing group is optional
  • \D+: Match one or more non-digits
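
As a rough sketch of how this pattern could slot into the script from the question, the snippet below runs it through re.findall. The sample string is made up for illustration (the question does not show its input), and the names ADDRESS_PATTERN and extract_addresses are hypothetical.

```python
import re

# Pattern from this answer: digits, an optional "/digits" part, then non-digits.
ADDRESS_PATTERN = re.compile(r"\d+(?:/\d+)?\D+")


def extract_addresses(text):
    """Return each run from one street number up to the next street number."""
    # Strip surrounding whitespace so trailing spaces do not end up in results.
    return [match.strip() for match in ADDRESS_PATTERN.findall(text)]


# Hypothetical input; the real data set is not shown in the question.
sample = "12 High Street Kelburn 3/40A Queen Street Thorndon 7 Main Road Tawa"
print(extract_addresses(sample))
# ['12 High Street Kelburn', '3/40A Queen Street Thorndon', '7 Main Road Tawa']
```

Because the pattern has no capturing group, re.findall returns the full text of each match, so 3/40A stays in one piece.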

A note on removing the negative lookahead (?!\d):

Because \D+ matches one or more non-digits, the match already stops before the next digit. Adding (?!\d) does not help: the character right after a full \D+ run is a digit, so the lookahead fails there and the regex has to backtrack, dropping the last character from the match.
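
To see the effect of the lookahead, here is a small comparison, assuming a made-up input string (the variable names are only for illustration):

```python
import re

# Hypothetical input; note the space before the second street number.
sample = "12 High Street 7 Main Road"

# With the lookahead, \D+ must give back its last character (the space
# before "7"), because the character following the full run is a digit.
with_lookahead = re.findall(r"\d+\D+(?!\d)", sample)

# Without it, \D+ simply runs up to the next digit.
without_lookahead = re.findall(r"\d+\D+", sample)

print(with_lookahead)     # ['12 High Street', '7 Main Road']
print(without_lookahead)  # ['12 High Street ', '7 Main Road']
```

Here the dropped character is only a space, but when a single non-digit such as / sits between two digits there is nothing left to give back, and the match fails at that position.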

anubhava
  • there should only be one `/`, so I think `(?:\d+\/)?\d+\D+` is better – Nick May 12 '23 at 05:56
  • Fair enough, but you wouldn't see more than one in New Zealand. – Nick May 12 '23 at 07:08
  • I can't mark this answer as correct because it fails in https://regex101.com/. Nick is correct re: slash symbol. His regex also works. However I'm still trying to figure it out. BTW, for anyone interested, I found this post useful https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-in-regular-expressions – Dave May 12 '23 at 19:58
  • Sorry, can you clarify which part fails on the regex101 demo: https://regex101.com/r/ijihst/2 ? – anubhava May 12 '23 at 20:12
  • Okay. So you edited your answer after I posted. So I will mark your new answer as correct. – Dave May 13 '23 at 04:23