I have the variable:

$line = "print      var1,   var2, var3";

var1 will always be present, but the other variables may not be.

I want to extract var1 and any other variables that appear.

I am currently using the following:

$line = "print      var1,   var2, var3";

if ($line =~ /\s*print\s*([A-Za-z0-9]+)(?=\s*,\s*([A-Za-z0-9]+))/){
    print "$1\n";

    while ($line =~ /\s*print\s*([A-Za-z0-9]+)(?=\s*,\s*([A-Za-z0-9]+))/g){
        print "$2\n";
    }
}

I'm not sure whether I have overcomplicated this, but the result is simply:

var1
var2

Instead of:

var1
var2
var3

Anyone know how I can achieve this?

Jerry
  • You could also try `^print\s*(*SKIP)(*F)|(\w+)`: http://regex101.com/r/pD5sV6/13. To match only on lines that contain the string `print`, use `(?:^print\s*|^(?:(?!print).)*)(*SKIP)(*F)|(\w+)` instead; see http://regex101.com/r/pD5sV6/16 – Avinash Raj Sep 20 '14 at 07:09

3 Answers

Keep it simple.

Determine the line that you want to parse using a regex, and then separate the values using split:

use strict;
use warnings;

while ( my $line = <DATA> ) {
    if ( $line =~ /^print\s+(.*)/ ) {
        my @vars = split /,\s*/, $1;
        print "@vars\n";
    }
}

__DATA__
print      var1,   var2, var3

Outputs:

var1 var2 var3
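
As a variation on the same idea, the sketch below wraps the match-and-split step in a helper so it can be applied to a single string rather than a filehandle. The name parse_print_vars is hypothetical, and tolerating whitespace on both sides of each comma (and at the end of the line) is an assumption that goes slightly beyond the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the variable names that follow "print",
# or an empty list if the line is not a print statement.
sub parse_print_vars {
    my ($line) = @_;
    # Capture everything after "print", minus trailing whitespace.
    return unless $line =~ /^\s*print\s+(.*?)\s*$/;
    # Split on commas, swallowing any whitespace around them.
    return split /\s*,\s*/, $1;
}

print join(" ", parse_print_vars("print      var1,   var2, var3")), "\n";
print join(" ", parse_print_vars("print var1")), "\n";
```

Because the helper returns a plain list, a line with only var1 needs no special handling: the split simply yields one element.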
Miller

The problem with your current regex is that you only have two capture groups. The simplest way would be to use as many capture groups as variables, but that isn't practical. The next simplest would be to match all the variables and then split on comma.

But if you still want to do it in one regex, you can use the \G anchor, which matches at the start of the string or at the position where the previous match ended. It can be a bit complex to understand (you can read more here), but here is how to use it to get what you are looking for:

my $line = "print      var1,   var2, var3";

# The first match starts at "print"; every later match must begin
# where the previous one ended (\G), so only variables after "print" are taken.
my @vars = $line =~ /\s*(?:print|(?!^)\G)\s+([A-Za-z0-9]+)(?:,|$)/g;

if (@vars) {
    print join("\n", @vars), "\n";
}

With the /g flag in list context, this regex collects all the variables in a single statement, so there is no need to print $1 separately.
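
To make the role of \G more visible, here is a minimal sketch that splits the work into two steps and reports where each match ends. It is not the regex above: the /c flag, the optional comma, and the @vars name are choices made for this illustration only.

```perl
use strict;
use warnings;

my $line = "print      var1,   var2, var3";
my @vars;

# Step 1: consume the "print" keyword. In scalar context /g records where
# the match ended, and /c keeps that position if a later match fails.
if ($line =~ /^\s*print\b/gc) {
    # Step 2: each iteration must start exactly where the last match ended (\G),
    # then skips whitespace and at most one comma before capturing a name.
    while ($line =~ /\G\s*,?\s*([A-Za-z0-9]+)/gc) {
        push @vars, $1;
        printf "matched %s, next match starts at offset %d\n", $1, pos($line);
    }
}

print "$_\n" for @vars;
```

The loop stops as soon as \G cannot be followed by another name, which is what prevents stray words elsewhere in the string from being picked up.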

Jerry
(?=(var\d+))

Use this. It will capture every variable, provided each name has the form var followed by digits.

Your regex will not work in all cases. It looks for print var only when that var is followed by another var, so it fails for a line containing just print var, because there is no var ahead of it.

See demo.

http://regex101.com/r/pD5sV6/12
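
The failure mode described above can be checked with a short sketch that runs both patterns over a one-variable line and a three-variable line. The names $original, $status and @found are illustrative only.

```perl
use strict;
use warnings;

# The pattern from the question: it requires a second variable after the first.
my $original = qr/\s*print\s*([A-Za-z0-9]+)(?=\s*,\s*([A-Za-z0-9]+))/;

for my $line ("print var1", "print var1, var2, var3") {
    my $status = $line =~ $original ? "matches" : "does not match";
    # The lookahead-only pattern captures every name of the form var<digits>.
    my @found = $line =~ /(?=(var\d+))/g;
    printf "%s: original %s; lookahead found %s\n", $line, $status, "@found";
}
```

Note that the lookahead-only pattern does not check for the print keyword at all, so it would also capture matching names on lines that are not print statements.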

vks