I have SQL code like:

--START
select * from table;
--END

I need to get everything between --START and --END. What I came up with is

/--START(?s)(.*)--END/

or

`/--START(\r\n|\r|\n)(?s)(.*)--END/`

The first one works, but it captures one empty line before the select. So in the end I get

empty line
select * from table;

I need to get rid of this empty line at the beginning. How can I match everything between --START followed by a newline and --END?

I tried the answers to "Match linebreaks - \n or \r\n?", but they didn't work for me.

JanFi86
  • Use `/--START\s*(.*?)\s*--END/`, see https://regex101.com/r/i5e5vp/1. Or a more specific `/--START\n(.*?)\n--END/`. Do you always expect a single line between the two strings? – Wiktor Stribiżew Sep 04 '19 at 09:55
  • To add to Wiktor's comment, you should use DOT ALL mode in your regex tool, assuming there could be more than one line in between `START` and `END`. – Tim Biegeleisen Sep 04 '19 at 09:56
  • What is the programming language? Do you need to match multiline blocks? What kind of line endings do you need to support? UNIX, MacOS, Windows or all? Do you need any leading/trailing whitespace in the result? – Wiktor Stribiżew Sep 04 '19 at 10:01
  • @Wiktor Stribiżew The language is Perl, the systems are Windows and UNIX, and I do not need any leading or trailing whitespace. – JanFi86 Sep 04 '19 at 10:04

1 Answer

In general, you may use a pattern like

/--START\s*(.*?)\s*--END/s

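See the regex demo. The s modifier lets . match line breaks too, so the block may span several lines. \s* matches zero or more whitespace characters, line breaks included, so it trims the captured text but does not require a line break after --START or before --END.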
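As a minimal sketch of how the general pattern could be used from Perl: the helper name extract_block and the sample input are assumptions made for illustration, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between --START and --END,
# or undef when the markers are not found.
sub extract_block {
    my ($text) = @_;
    # /s lets . match line breaks; \s* on both sides drops the
    # whitespace (including empty lines) around the captured text.
    return $text =~ /--START\s*(.*?)\s*--END/s ? $1 : undef;
}

# Assumed sample input: an empty line and stray spaces around the statement.
my $sql = "--START\n\n  select * from table;  \n--END\n";

print "[", extract_block($sql), "]\n";   # prints [select * from table;]
```

Line breaks inside the block are kept; only the whitespace next to the two markers is dropped.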

A bit more specific pattern will be

/--START\h*\R\s*(.*?)\h*\R\s*--END/s

Or, if the --START and --END should appear at the start of lines, add anchors and m modifier:

/^--START\h*\R\s*(.*?)\h*\R\s*--END$/sm

See the regex demo and another regex demo.

Details

  • ^ - start of a line (since m modifier is used)
  • --START - left-hand delimiter
  • \h* - 0+ horizontal whitespaces
  • \R - a line break
  • \s* - 0+ whitespaces
  • (.*?) - Group 1: any 0+ chars, as few as possible (line breaks included, since the s modifier is used)
  • \h* - 0+ horizontal whitespaces
  • \R - a line break
  • \s* - 0+ whitespaces
  • --END - the right-hand delimiter
  • $ - end of a line.
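The anchored pattern described above could be wrapped like this. It is a sketch under stated assumptions: extract_block and the sample strings are hypothetical, and the CRLF normalization step is an addition, not part of the pattern. It is there because $ under the m modifier matches only before "\n", so a line ending in "--END\r\n" would otherwise not match. \h and \R need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns the text
# between the marker lines, or undef when no block is found.
sub extract_block {
    my ($text) = @_;
    # Assumption: convert Windows CRLF to LF first, because $ under /m
    # matches before "\n" but not before "\r\n".
    $text =~ s/\r\n/\n/g;
    return $text =~ /^--START\h*\R\s*(.*?)\h*\R\s*--END$/sm ? $1 : undef;
}

# Assumed sample inputs with UNIX and Windows line endings.
my $unix    = "-- header\n--START\n\nselect *\n  from table;\n--END\n";
my $windows = "--START  \r\n\r\nselect * from table;\r\n--END\r\n";

print extract_block($unix), "\n";      # the two-line select, no leading empty line
print extract_block($windows), "\n";   # select * from table;
```

If the file is read on Windows through Perl's default text mode, the line endings are usually already translated to "\n", in which case the substitution is a harmless no-op.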
Wiktor Stribiżew