I'm trying to understand why the following does not return what I expect (or want :)):

sed -r 's/^(.*?)(Some text)?(.*)$/\2/' list_of_values

or Perl:

perl -lpe 's/^(.*?)(Some text)?(.*)$/$2/' list_of_values

So I want the result to be just Some text; otherwise (meaning nothing was captured in $2) the result should be EMPTY.

I did notice that with perl it does work if Some text is at the start of the line/string (which baffles me...). I also noticed that removing ^ and $ has no effect.

Basically, I'm trying to get what grep would return with the --only-matching option as discussed here, except that I want/need to do it with a substitution (s///).

EDITED (added sample data)

Sample input:

$ cat -n list_of_values
     1  Black
     2  Blue
     3  Brown
     4  Dial Color
     5  Fabric
     6  Leather and Some text after that ....
     7  Pearl Color
     8  Stainless Steel
     9  White
    10  White Mother-of-Pearl Some text stuff

Desired output:

$ perl -ple '$_ = /(Some text)/ ? $1 : ""' list_of_values | cat -n
     1
     2
     3
     4
     5
     6  Some text
     7
     8
     9
    10  Some text
lzc

1 Answer

First of all, this shows how to duplicate grep -o using Perl.


You're asking why

foo Some text bar
012345678901234567

results in just an empty string instead of

Some text

Well,

  • At position 0, ^ matches 0 characters.
  • At position 0, (.*?) matches 0 characters.
  • At position 0, (Some text)? matches 0 characters.
  • At position 0, (.*) matches 17 characters.
  • At position 17, $ matches 0 characters.
  • Match succeeds with $2 unset, so the replacement is empty.

You could use

s{^ .*? (?: (Some[ ]text) .* | $ )}{ $1 // "" }exs;

or

s{^ .*? (?: (Some[ ]text) .* | $ )}{$1}xs;     # Warns if warnings are on.

Far simpler:

$_ = /(Some text)/ ? $1 : "";

I question your use of -p. Are you sure you want a line of output for each line of input? It seems to me you'd rather have

perl -nle'print $1 if /(Some text)/'
ikegami
  • The answer/trick was about making proper use of non-capturing groups; that did it for me! Thanks. I just wonder why you are using the `/s` modifier? – lzc Apr 06 '17 at 16:55
  • Why would I want to check if the characters are newlines? – ikegami Apr 06 '17 at 17:13