I am struggling to find a regular expression that extracts the first two lines of an address, skipping any line that contains the word "Account".

If we take this address:

Company Name
Some Road
Some Town

I can use the regular expression (?:.*\s*){2} to return

Company Name Some Road

Which is great.

However, if there is an extra line at the top, making the address become:

Accounts Payable
Company Name
Some Road
Some Town

Then it no longer picks up those two lines that I want.

I have tried the method from "Regular expression to match a line that doesn't contain a word?" without success, and have also tried combinations such as (?!Account.*)(?:.*\s*){3}, but with little success.

The Microsoft reference at https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference lists a great many constructs, but I haven't managed to get a combination working yet.

The closest I've got was using [^Account.*](?:.*\s*){3} which returns

s Payable Company Name Some Road

I just can't get it to remove the rest of that line! Any help would be appreciated. Thanks.

Joe
  • Try `^(?!Accounts)(?:.*\n?){2}` with `^` in multiline mode. If it is not a text editor, add `(?m)` to the start of the regex. See https://regex101.com/r/1Ci5yD/1 – Wiktor Stribiżew Apr 16 '19 at 09:00
  • Your answer with adding (?m) to the start has accomplished what I needed! This is the missing part of the puzzle! Place this as an answer, and I'll mark it as the accepted one. Thank you. – Joe Apr 16 '19 at 09:08

1 Answer

You may use a ^ with multiline mode on:

(?m)^(?!Accounts)(?:.*\n?){2}

Or (a bit more efficient and following best practices):

(?m)^(?!Accounts).*(?:\n.*)?

See the regex demo and this regex demo.

When (?m) is added to the pattern, ^ matches at the start of every line rather than only at the start of the string. The parts of the two patterns are:

  • ^ - start of a line
  • (?!Accounts) - a negative lookahead that fails the match if the line starts with Accounts
  • (?:.*\n?){2} - (first pattern) two occurrences of any 0+ chars other than line break chars, each followed by an optional newline
  • .*(?:\n.*)? - (second pattern) a line and an optional subsequent line.
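
As a quick check of both patterns against the two addresses from the question, here is a minimal sketch in Python. It assumes Python's re module treats (?m), ^, the negative lookahead and . the same way .NET does for this input, which should hold here but is worth confirming in your own engine. The helper name first_two_lines is hypothetical and only used for illustration.

```python
import re

# Both patterns are copied from the answer; (?m) lets ^ match at every line start.
TWO_LINES = re.compile(r"(?m)^(?!Accounts)(?:.*\n?){2}")
LINE_PLUS_NEXT = re.compile(r"(?m)^(?!Accounts).*(?:\n.*)?")

# The two sample addresses from the question.
plain = "Company Name\nSome Road\nSome Town"
prefixed = "Accounts Payable\nCompany Name\nSome Road\nSome Town"


def first_two_lines(pattern, address):
    """Return the first match with line breaks collapsed to spaces, or None."""
    match = pattern.search(address)
    if match is None:
        return None
    # split() with no arguments drops the newlines, including a trailing one.
    return " ".join(match.group(0).split())


for address in (plain, prefixed):
    print(first_two_lines(TWO_LINES, address))       # Company Name Some Road
    print(first_two_lines(LINE_PLUS_NEXT, address))  # Company Name Some Road
```

With the "Accounts Payable" line present, the lookahead rejects the match at the first line start, so the search moves on to the next line start and matches from "Company Name". Note that the first pattern also captures the newline after the second line, while the second pattern does not.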
Wiktor Stribiżew