I read lines from a text file and need to split them into words. Everything inside single or double quotes should be ignored. For example, the line use line; "$var", print 'comment': "get 'comment % two'" should be stored in an array as use, line, print. Everything else is ignored. I also need to check whether % sits inside single or double quotes (as in the example above).

my @array = $file_line =~ /[\$A-z_]{2,}/g; gives all the words (plus anything that contains $), but I can't get it to ignore the characters inside the quotes.

Any ideas?

Thanks

Max_S
  • possible duplicate of [Regex for splitting a string using space when not surrounded by single or double quotes](http://stackoverflow.com/questions/366202/regex-for-splitting-a-string-using-space-when-not-surrounded-by-single-or-double) – CrayonViolent Mar 13 '14 at 02:34

2 Answers

I agree with the answer that you can first remove the quoted words using

$line =~ s/ ( ["'] ) .*? \1 //xg;

However, you should be aware that your regular expression

[\$A-z_]

matches every ASCII character between 'A' and 'z', which includes the following punctuation characters:

[ \ ] ^ _ `
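
This can be checked directly. The following is a minimal sketch, not part of the original answer; the helper name punctuation_in_A_to_z is made up for illustration. It walks the code points between 'Z' and 'a' and keeps the ones the A-z range accepts.

```perl
use strict;
use warnings;

# Hypothetical helper: list the non-letter characters matched by [A-z].
sub punctuation_in_A_to_z {
    # Code points strictly between 'Z' (90) and 'a' (97) in ASCII.
    my ($lo, $hi) = (ord('Z') + 1, ord('a') - 1);
    my @between = map { chr($_) } $lo .. $hi;
    # Keep only the characters that the A-z range matches.
    return grep { /^[A-z]$/ } @between;
}

print join(' ', punctuation_in_A_to_z()), "\n";   # prints: [ \ ] ^ _ `
```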

So you should either be more explicit in your regular expression

[\$A-Za-z_]

or you should add the case-insensitive flag "i" to your match (it is a match, not a substitution) and use just one case in the character class:

$file_line =~ /[\$A-Z_]{2,}/gi;
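
Putting the two steps together might look like the sketch below. It assumes the sample line from the question; the sub name unquoted_words is hypothetical and only serves to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: strip quoted spans, then collect the remaining words.
sub unquoted_words {
    my ($line) = @_;
    # Remove every '...' or "..." span; \1 makes the closing quote match the opener.
    $line =~ s/ (["']) .*? \1 //xg;
    # Words of two or more characters, using the explicit character class.
    return $line =~ /[\$A-Za-z_]{2,}/g;
}

# q{} keeps the line literal, so $var is not interpolated.
my $file_line = q{use line; "$var", print 'comment': "get 'comment % two'"};
my @array = unquoted_words($file_line);
print join(', ', @array), "\n";   # prints: use, line, print
```

Because the quoted spans are removed first, "$var" never reaches the word match, so the \$ in the class only matters for unquoted variables.
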
Mark Nodine

You can first remove all the quoted words, for example using:

$line =~ s/ ( ["'] ) .*? \1 //xg;

You might want to slightly change it depending on how you want to handle nested quotes, unclosed quotes etc.
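
One such variation, sketched here as an assumption rather than as the answerer's method, captures the quoted spans instead of deleting them, so that the question's "% inside quotes" check can be done. The sub name quoted_with_percent is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the quoted spans of a line that contain '%'.
sub quoted_with_percent {
    my ($line) = @_;
    my @hits;
    # Same pattern as the substitution, but capturing the span instead of deleting it.
    while ($line =~ / (["']) (.*?) \1 /xg) {
        my ($quote, $body) = ($1, $2);
        push @hits, "$quote$body$quote" if index($body, '%') >= 0;
    }
    return @hits;
}

my $file_line = q{use line; "$var", print 'comment': "get 'comment % two'"};
print "$_\n" for quoted_with_percent($file_line);   # prints: "get 'comment % two'"
```

With this pattern a single quote nested inside double quotes is treated as part of the outer span, and an opening quote with no matching close does not match at all, so the text after it is left in place.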

ggoossen