The task: I need to find every abbreviation of an address object identifier in a string, using a list of those abbreviations, so that I can delete them later. The real abbreviation list is in another language and is much bigger (200+ elements), so a foreach loop is out of the question because "complex regex beats foreach in speed".

The problem: a regex like `(?:[^\w\d]|\A)(?:street|str|c|city|state|st|apt)([^\w\d]|\Z)` works on a string like `Klutc state, Beast st, apt c5` and correctly gives `state`, `st`, `apt`.

But on the string `state Klutc, Beast st,apt c5` it returns `state` and `st` but not `apt`, because the `,` that `[^\w\d]` needs in front of `apt` has already been consumed as the trailing delimiter of the preceding `st` match.

I also cannot use only the left side, `(?:[^\w\d]|\A)(?:street|str|c|city|state|st|apt)`, because on `Klutc state, Beast st, apt c5` it also returns `c` from `c5`.

Neither can I use only the right side, `(?:street|str|c|city|state|st|apt)([^\w\d]|\Z)`, because on `Klutc state, Beast st, apt c5` it also returns `st` from `Beast` and `c` from `Klutc`.
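The three behaviours described above can be reproduced with a short sketch. It is written in Python rather than C#, and it adds a capturing group around the alternation so that only the abbreviation is reported; the names `ABBR`, `both_sides`, `left_only` and `right_only` are made up for this illustration.

```python
import re

# Abbreviation alternation copied from the question.
ABBR = r"street|str|c|city|state|st|apt"

# Delimiters on both sides are consumed as part of each match.
both_sides = re.compile(rf"(?:[^\w\d]|\A)({ABBR})(?:[^\w\d]|\Z)")
# Only the leading delimiter is required.
left_only = re.compile(rf"(?:[^\w\d]|\A)({ABBR})")
# Only the trailing delimiter is required.
right_only = re.compile(rf"({ABBR})(?:[^\w\d]|\Z)")

first = "Klutc state, Beast st, apt c5"
second = "state Klutc, Beast st,apt c5"

print(both_sides.findall(first))   # ['state', 'st', 'apt']
# The "," after "st" is consumed, so "apt" has no leading delimiter left.
print(both_sides.findall(second))  # ['state', 'st']
# "c" from "c5" slips through because nothing checks what follows it.
print(left_only.findall(first))    # ['state', 'st', 'apt', 'c']
# "c" from "Klutc" and "st" from "Beast" slip through.
print(right_only.findall(first))   # ['c', 'state', 'st', 'st', 'apt']
```

Because `findall` resumes scanning after the end of the previous match, a delimiter consumed by one match is not available to the next one.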

The question: how should I rewrite the regex so that it correctly returns only the abbreviations? In other words, `st` must not steal the `,` from `,apt`; both `st` and `apt` should be able to use the same `,` as their delimiter. Test inputs are:

Klutc state, Beast st, apt c5

state Klutc, Beast st,apt c5

Klutc State,Beast st,c5 apt
ProgrammingLlama
Alex
  • It's not apparent in your question how this relates to `c#`. Are you expecting the answer to use the `System.Text.RegularExpressions` namespace and the `Regex` type? Do you have any C# code that relates to this? – Brett Caswell Jun 01 '21 at 04:08
  • Instead of the non-capturing group `(?:[^\w\d]|\A)`, use a positive lookbehind at the beginning, `(?<=[^\w\d]|\A)`, and a positive lookahead at the end, `(?=[^\w\d]|\Z)`. Read up on this very useful feature. – Chris Maurer Jun 01 '21 at 04:33
  • @ChrisMaurer, Many thanks! How can I mark your comment as an answer? – Alex Jun 01 '21 at 05:05
  • You can't, but you can ask them to post it as an answer and then accept it, or post it yourself (perhaps as a "community wiki" where you abstain from reaping any reputation from any upvotes); but probably this is too vague and a duplicate. Please review [the `regex` tag guidance](/tags/regex/info) and update your question to provide the details which have been requested above. – tripleee Jun 01 '21 at 05:12
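The lookaround suggestion from the comments can be sketched as follows. This is a minimal illustration in Python, not C#, with some assumptions: .NET allows variable-width lookbehind, so `(?<=[^\w\d]|\A)` should work there as written, whereas Python's `re` module requires fixed-width lookbehind, so the sketch uses the equivalent negative forms `(?<!\w)` and `(?!\w)`. The helper names `build_pattern` and `find_abbreviations` are hypothetical, and case-insensitive matching is assumed because the third test input contains `State`.

```python
import re

def build_pattern(abbreviations):
    """Compile one regex that matches any abbreviation as a whole token."""
    # Escape each entry so list items are treated literally; longest first
    # so that "street" is tried before "str" and "st".
    alternatives = "|".join(
        re.escape(a) for a in sorted(abbreviations, key=len, reverse=True)
    )
    # Lookarounds assert the boundaries without consuming them, so two
    # neighbouring abbreviations can share a single delimiter.
    # re.IGNORECASE is an assumption, not something the question states.
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)

# Sample list from the question; the real list is said to be much larger.
ABBREVIATIONS = ["street", "str", "c", "city", "state", "st", "apt"]
PATTERN = build_pattern(ABBREVIATIONS)

def find_abbreviations(text):
    """Return every abbreviation found in text, in order of appearance."""
    return PATTERN.findall(text)

print(find_abbreviations("Klutc state, Beast st, apt c5"))  # ['state', 'st', 'apt']
print(find_abbreviations("state Klutc, Beast st,apt c5"))   # ['state', 'st', 'apt']
print(find_abbreviations("Klutc State,Beast st,c5 apt"))    # ['State', 'st', 'apt']
```

Since the lookarounds are zero-width, the `,` in `st,apt` satisfies both the trailing check for `st` and the leading check for `apt`.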

0 Answers