3

I want to find all characters between two special characters. I can't find a solution because the text can span new lines, which my patterns don't include. It's probably easy, but I can't seem to find the right regex for it.

How do I solve this problem?

I have tried two patterns. The first,

\#(.*)\; 

doesn't include new lines, and the second,

(?!\#)([\S\s])(?!=\;) 

doesn't work either.

It selects everything, but doesn't capture the text as a group...

Source looks like this:

#first line of text;
#second line of text;
#third line could easy 
be on a new line;
#forth etc;
#this could (#hi,#hi,#hi) also 
happen though:));
#so.... any idea;

Every entry starts with # and ends with ;, and an entry may span more than one line.

Emma
ronald

3 Answers

1

I see two problems in your regex:

  • The [\S\s] is missing a quantifier, so it matches only one character.
  • The quantifier needs to be non-greedy, so that a single match doesn't swallow all the lines.

Also, where you wrote (?!#), I guess you meant to match any one of those characters, in which case you should put them in a character set like this: [?!#]

You need this regex, which captures your text in group 1:

#([\w\W]*?);

Regex Demo
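
As a rough illustration of the pattern above, here is a minimal Python sketch. The choice of Python and the variable names source and records are my own assumptions (the question doesn't name a language or engine); the sample text is copied from the question.

```python
import re

# Sample input copied from the question; entries may span several lines.
source = (
    "#first line of text;\n"
    "#second line of text;\n"
    "#third line could easy \n"
    "be on a new line;\n"
    "#forth etc;\n"
    "#this could (#hi,#hi,#hi) also \n"
    "happen though:));\n"
    "#so.... any idea;\n"
)

# [\w\W] matches any character, newlines included; *? keeps each match
# as short as possible, so it stops at the first ; after each #.
records = re.findall(r"#([\w\W]*?);", source)

for record in records:
    print(repr(record))
```

With a single capturing group, re.findall returns the contents of group 1 for each match, so the leading # and trailing ; are left out.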

And, as you attempted, if you want the full match to contain only the intended text, you can use lookarounds.

Regex Demo with lookarounds so your full match is intended text only

Also, [^;]* (which also matches newlines) is usually faster than a lazy match such as [\w\W]*?, because the negated class consumes characters in one pass instead of testing for ; after every character. You should therefore prefer this regex:

(?<=[?!#])[^;]*(?=;)

Regex Demo with best performance
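
A small sketch of the lookaround version, again assuming Python (not stated in the question) and using a shortened sample; the names source and matches are hypothetical. The character set [?!#] is kept exactly as written above, which means a ? or ! would also start a match.

```python
import re

# Shortened sample based on the question's data.
source = (
    "#first line of text;\n"
    "#third line could easy \n"
    "be on a new line;\n"
    "#forth etc;\n"
)

# (?<=[?!#]) requires one of ? ! # just before the match, and (?=;)
# requires a ; just after it. Neither is part of the match itself,
# so group(0), the full match, is only the text in between.
pattern = re.compile(r"(?<=[?!#])[^;]*(?=;)")
matches = [m.group(0) for m in pattern.finditer(source)]

for text in matches:
    print(repr(text))
```

No capturing group is needed here, since the delimiters are only asserted and never consumed.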

Pushpesh Kumar Rajwanshi
1

You just need to modify your first regex a little bit so that it looks like this:

#([\s\S]*?);

  • . only matches characters other than newlines, so I replaced it with [\s\S]: the union of whitespace and non-whitespace characters, i.e. every character. If your regex engine has a "single line" option, you can turn that on instead, and . will match newlines as well.

  • I also made * lazy. Otherwise there would be just one match, running all the way to the last ;. For more info, see this question.

  • You don't need to escape the ;.
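
To make the points above concrete, here is a minimal comparison in Python (an assumption on my part, since the question doesn't say which engine is used). The sample string and the names dot_only, greedy and lazy are made up for illustration.

```python
import re

# Hypothetical sample: the second entry spans two lines.
source = "#one;\n#two\nlines;\n#three;\n"

# 1) Plain dot: . does not match newlines, so the two-line entry is skipped.
dot_only = re.findall(r"#(.*);", source)

# 2) [\s\S] with a greedy *: one match running up to the last ; in the input.
greedy = re.findall(r"#([\s\S]*);", source)

# 3) [\s\S] with a lazy *?: one match per entry, stopping at the first ;.
lazy = re.findall(r"#([\s\S]*?);", source)

print(dot_only)
print(greedy)
print(lazy)
```

Only the third pattern returns every entry separately, including the one that spans a line break.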

Sweeper
0

You have to either use the single-line flag /s or add the whitespace class \s as a second alternative to the any-character dot (.). Also, your * quantifier must be lazy (non-greedy), so that the match stops at the first ; it finds.

#((?:.|\s)*?); or /#(.*?);/s
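
For what it's worth, here is how the two variants might look in Python, which is my assumption and not something the question specifies. Python has no /s suffix; the equivalent is the re.DOTALL flag or the inline (?s) prefix. The sample string and variable names are hypothetical.

```python
import re

# Hypothetical sample: the second entry spans two lines.
source = "#one;\n#two\nlines;\n"

# Variant 1: let the dot match newlines via the DOTALL ("single line") flag.
with_flag = re.findall(r"#(.*?);", source, flags=re.DOTALL)

# Same thing using the inline flag at the start of the pattern.
with_inline_flag = re.findall(r"(?s)#(.*?);", source)

# Variant 2: no flag; \s is added as an alternative to the dot.
with_alternation = re.findall(r"#((?:.|\s)*?);", source)

print(with_flag)
print(with_inline_flag)
print(with_alternation)
```

All three return the same entries here; the flag version avoids the alternation inside the repeated group.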
Egan Wolf