6

I have a passage of numbered verses. I want each numbered verse on a separate line, so I add a newline before each number. But the text also contains parentheses with numbers inside them, and those numbers get a newline too. I don't want to match the numbers inside parentheses. I used

$_=~s/(\d+)/\n$1 /gs;

with this input:

1Hello2Hai (in 2:3) 3hi 4 bye

but it also replaces the numbers inside the parentheses.

Required output :

1 Hello
2 Hai (in 2:3)
3 hi
4 bye

Actual output:

1 Hello
2 Hai (in
2:
3)
3 hi
4 bye

How do I construct the regex so that it doesn't match inside parentheses? I am using Perl for the regex.

Cœur
xtreak

2 Answers

4

You can try this:

#!/usr/bin/perl 
use strict;
use warnings;

my $stro = <<'END';
1Hello2Hai (in 2:3) 3hi 4 bye
END

$stro =~s/(\((?>[^()]++|(?1))*\))(*SKIP)(*FAIL)|\s*(\d+)\s*/\n$2 /g;

print $stro;

Pattern details:

The idea is to skip content in parentheses. To do that, I first match a parenthesized group with this recursive subpattern: (\((?>[^()]++|(?1))*\)). I then make that alternative fail, and the backtracking control verbs (*SKIP) and (*FAIL) stop the regex engine from retrying the same substring with another alternative.

(*SKIP) marks the current position: if the pattern fails later, the engine does not retry any of the text matched to its left and resumes the search from that position.

(*FAIL) forces the pattern to fail at that point.
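To make the roles of the two alternatives easier to see, here is a minimal sketch of the same pattern written with the /x flag and comments. The helper name split_verses is hypothetical, and the extra substitution that drops the leading newline is an addition that is not part of the one-liner above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: put each numbered verse on its own line and
# leave numbers inside (possibly nested) parentheses untouched.
sub split_verses {
    my ($text) = @_;
    $text =~ s{
        ( \( (?> [^()]++ | (?1) )* \) )  # group 1: a balanced parenthesized block, recursing via (?1)
        (*SKIP)(*FAIL)                   # give up on it and resume scanning after the block
      |
        \s* (\d+) \s*                    # group 2: a verse number outside parentheses
    }{\n$2 }gx;
    $text =~ s/\A\n//;                   # drop the newline added before the first verse
    return $text;
}

print split_verses("1Hello2Hai (in 2:3) 3hi 4 bye"), "\n";
```

With the sample input from the question, this prints the four lines listed under "Required output".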

Another way:

As the Perl documentation notes, backtracking control verbs are an experimental regex feature, and their use in production code should be noted. (That said, the feature has existed for several years.)

Here is a simple way without these features: match everything that precedes a number, then drop it from the match result with \K:

s/(?:(\((?>[^()]++|(?1))*\))|[^\d(]+)*\K\s*(\d+)\s*/\n$2 /g
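As a runnable sketch of this second approach, the substitution can be wrapped in a small function. The name split_verses_k is hypothetical, and the two cleanup substitutions are additions: because [^\d(]+ also consumes whitespace in front of a number, the \K variant leaves that whitespace at the end of the previous line, and it also emits a newline before the first verse.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper built around the \K substitution shown above.
sub split_verses_k {
    my ($text) = @_;
    # Consume parenthesized blocks and non-digit text, forget them with \K,
    # then replace only the number (and the whitespace around it).
    $text =~ s/(?:(\((?>[^()]++|(?1))*\))|[^\d(]+)*\K\s*(\d+)\s*/\n$2 /g;
    $text =~ s/[ \t]+(?=\n)//g;   # trim spaces left in front of the inserted newlines
    $text =~ s/\A\n//;            # drop the newline added before the first verse
    return $text;
}

print split_verses_k("1Hello2Hai (in 2:3) 3hi 4 bye"), "\n";
```

One design note: a quantified character class inside a repeated group can backtrack heavily on long stretches of text that contain no further number, so it may be worth timing this variant on real input before relying on it.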
Casimir et Hippolyte
  • Did you check the output for the test case I provided? I get an error here @Casimir. Sorry, I am new to Perl; what do *SKIP and *FAIL mean? – xtreak Dec 14 '13 at 13:10
  • Sorry, I had both the Perl script and a text file with the same name, and perl had run the text file. I will mark it as the answer. Thanks a lot. Please provide me a link to learn about skip and fail @Casimir – xtreak Dec 14 '13 at 13:20
  • @xtreak: you can find information about this feature by following the link in my answer or by reading this fabulous answer: http://stackoverflow.com/questions/19992984/verbs-that-act-after-backtracking-and-failure :) – Casimir et Hippolyte Dec 14 '13 at 13:38
1

Use this pattern with the /g option:
(\D+)(\d+)(?=((?!\)).)*\(|[^()]*$)
and replace with $1\n$2

Or, to adjust the indentation, use this pattern with the /g option:
(\d+)\s*(?=((?!\)).)*\(|[^()]*$)
and replace with \n$1
With this version you have to remove the blank first line that it produces.

alpha bravo