match text followed by multiple line breaks with spaces

Question

I would like to match a block of text (numbers, strings, special characters, spaces, single line breaks, ...) that is followed by at least two line breaks, where each of the following lines consists of a space and then a line break. At the moment I can only match the multiple line breaks, but I want to match the text before them. I am using this regular expression: \n+\s*\n+ and this is my input:

        Test Test TestTester TestTestt                              Test Test TestTestTestTest: 29724 @erq
        Test Test we                                Test Test, iuow, 0202220
        Test Test  962ert64






                             Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest 
                                      Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest 
                                      Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest 
Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest Test Test TestTestTestTest

The output should be:

Test Test TestTester TestTestt                              Test Test TestTestTestTest: 29724 @erq
        Test Test we                                Test Test, iuow, 0202220
        Test Test  962ert64

Something like `if (preg_match('~^(.*?)\R{3,}~s', $s, $match)) { echo trim($match[1]); }`, see [demo](https://regex101.com/r/iShVyq/1). – Wiktor Stribiżew, Apr 11 '19 at 14:05
Thank you @WiktorStribiżew, but this does not work for me, as the line breaks after the text always start with a space (see the post, I have just updated it), and I can only use the global flag. – Mana, Apr 11 '19 at 14:21

score 2 · Accepted Answer · edited Apr 12 '19 at 18:25

This one should help:

$re = '/(.+\n)\n\s*\n/sU';
preg_match($re, $str, $matches, PREG_OFFSET_CAPTURE, 0);

The flags s and U are really important here!

s means that . will match newlines, and U will make the quantifiers ungreedy (lazy).

And here is a working example: https://regex101.com/r/G0KS3g/1
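
As a rough, self-contained sketch of the snippet above: the input below is a hypothetical, shortened stand-in for the question's data, and it assumes the separator lines are completely empty. The variable name $firstBlock is only illustrative.

```php
<?php
// Hypothetical, shortened stand-in for the question's input: three text lines,
// then empty separator lines (assumed to contain no spaces), then more text.
$str = "Test 29724 @erq\n  Test we, 0202220\n  Test 962ert64\n\n\n\n   Next block\n";

// s: "." also matches newlines; U: every quantifier is lazy by default.
$re = '/(.+\n)\n\s*\n/sU';

$firstBlock = null;
if (preg_match($re, $str, $matches, PREG_OFFSET_CAPTURE) === 1) {
    // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
    $firstBlock = rtrim($matches[1][0]);
}

echo $firstBlock, PHP_EOL;
```

Note that `(.+\n)\n` needs two directly consecutive newlines, so this pattern only matches when at least one separator line is truly empty. If every separator line holds a space, it will not match.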

UPD: If you can't use flags, try this one:

([\S\s]*?)\n\s*\n

Here we have a lazy quantifier *?, and [\S\s] matches any character, including a newline: \S covers every non-whitespace character and \s covers every whitespace character, \n among them.

However, the regex dialect of your software might bring more limitations.
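
To illustrate the flag-free variant, here is a minimal PHP sketch. PHP still needs delimiters around the pattern, but no modifiers follow them. The input is hypothetical sample data in which each separator line holds a single space, as the question describes. Other regex dialects may behave differently.

```php
<?php
// Hypothetical sample: each separator line holds a single space,
// as described in the question.
$str = "Test 29724 @erq\n  Test we, 0202220\n  Test 962ert64\n \n \n   Next block\n";

// No modifiers after the closing delimiter: [\S\s] already matches newlines,
// and *? is lazy without needing the U flag.
$re = '/([\S\s]*?)\n\s*\n/';

// Group 1 holds the text before the first run of blank (or space-only) lines.
$firstBlock = preg_match($re, $str, $m) === 1 ? $m[1] : null;

echo $firstBlock, PHP_EOL;
```

Because \s* can absorb the spaces between the line breaks, this version also works when the separator lines are not completely empty.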

edited Apr 12 '19 at 18:25

Wiktor Stribiżew

answered Apr 11 '19 at 14:22

Ildar Akhmetov

Can I just have a regular expression with no flags, or with just the "global" flag? – Mana Apr 11 '19 at 14:49
I am not embedding this regular expression in code. I am using drag-and-drop software that only accepts regular expressions without flags. – Mana Apr 11 '19 at 14:55
Updated the answer, added an option that doesn't use any flags. – Ildar Akhmetov Apr 11 '19 at 15:04
Never use `(.|\n)*?`, please replace with `.*?` and add `/s` modifier. – Wiktor Stribiżew Apr 11 '19 at 18:31
I really mean it, never use `(.|\n)`, NEVER. – Wiktor Stribiżew Apr 12 '19 at 18:25
Thanks, Wiktor! Maybe you can share a Stackoverflow post with a detailed explanation on that? – Ildar Akhmetov Apr 13 '19 at 06:55
Can you please explain why (.|\n) should not be used? – Mana Apr 16 '19 at 10:23
@Mana Discussed it today here: https://stackoverflow.com/questions/55703561/why-using-n-is-a-bad-idea – Ildar Akhmetov Apr 16 '19 at 10:25
