-2

I am using regular expressions to parse an XML file. (I know regexps are not recommended for XML parsing, but I have to use them; there is no other option.)

My question is how to skip commented lines in an XML file while parsing it with Perl.

I want Perl to parse the XML file while skipping the commented lines.

Can anyone help me, please?

Thanks, Senthil.

JohnB
Senthil kumar

4 Answers

4

As bad as this question is for many people, many answers to it are just as bad: use an XML parser, here's why, end of the discussion.

For me, the whole point of asking a question on Stack Overflow is to obtain a solution. Have we provided a solution to the OP? Not quite.

A more complete answer would offer some examples of how to parse XML. Here are some:

Can you provide an example of parsing HTML with your favorite parser?

Community
Philippe A.
3

If your problem is compiling XML libraries, you can try XML::Parser::Lite or XML::Parser::PurePerl, which are pure-Perl modules requiring no compilation.

Or, you might be able to find pre-compiled packages of the non-pure-perl libraries. What OS are you on?

runrig
  • MKDoc::XML is another lightweight pure-perl XML parser which, amusingly enough, uses a monster regex as a tokenizer -- but it's the *right* regex. – hobbs Jul 31 '10 at 05:35
2

Please do not parse XML with regular expressions; use an XML parser instead.

If that is not possible, you can at least write a simple parser based on a finite-state machine to process your XML. It is very simple to do.
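To illustrate the finite-state-machine idea, here is a minimal sketch in Perl. The function name `tokenize_xml` and the three state names are made up for this example, and it only handles plain tags, text, and comments. It assumes no CDATA sections, no processing instructions, and no `>` inside attribute values.

```perl
use strict;
use warnings;

# Hypothetical scanner (not from any module): walks the input with three
# states (TEXT, TAG, COMMENT) and returns [type, value] tokens.
# Everything seen while in the COMMENT state is dropped.
sub tokenize_xml {
    my ($xml) = @_;
    my @tokens;
    my $state = 'TEXT';
    my $buf   = '';
    my $pos   = 0;
    my $len   = length $xml;

    while ($pos < $len) {
        if ($state eq 'TEXT') {
            if (substr($xml, $pos, 4) eq '<!--') {
                # Comment opener: flush pending text, then skip the opener.
                push @tokens, ['text', $buf] if $buf =~ /\S/;
                ($buf, $state) = ('', 'COMMENT');
                $pos += 4;
            }
            elsif (substr($xml, $pos, 1) eq '<') {
                # Ordinary tag opener.
                push @tokens, ['text', $buf] if $buf =~ /\S/;
                ($buf, $state) = ('', 'TAG');
                $pos++;
            }
            else {
                $buf .= substr($xml, $pos++, 1);
            }
        }
        elsif ($state eq 'COMMENT') {
            if (substr($xml, $pos, 3) eq '-->') {
                $state = 'TEXT';
                $pos += 3;
            }
            else {
                $pos++;    # ignore comment content
            }
        }
        else {    # TAG
            my $ch = substr($xml, $pos++, 1);
            if ($ch eq '>') {
                push @tokens, ['tag', $buf];
                ($buf, $state) = ('', 'TEXT');
            }
            else {
                $buf .= $ch;
            }
        }
    }
    # Flush trailing text, if any.
    push @tokens, ['text', $buf] if $state eq 'TEXT' && $buf =~ /\S/;
    return @tokens;
}

my @tokens = tokenize_xml('<a><!-- <b>hidden</b> -->shown<c/></a>');
print join(' ', map { "$_->[0]:$_->[1]" } @tokens), "\n";
# prints: tag:a text:shown tag:c/ tag:/a
```

The `<b>hidden</b>` element inside the comment never becomes a token, because the scanner emits nothing while it is in the COMMENT state.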

Community
Daniel O'Hara
  • 2
    The OP is aware of it, but doesn't have another option – NullUserException Jul 30 '10 at 13:33
  • 1
    to quote op "I am using regular expression to parse XML file (though regexp is not recommended for xml parsing, but i have to use regexp, no other go)." This answer is not useful as he already knows it... bigger question is why he can't use a parser. – xenoterracide Jul 30 '10 at 13:33
  • I understand this, but I don't understand why he *has to* use regexp. – Daniel O'Hara Jul 30 '10 at 13:40
  • 1
    +1 for adding the second paragraph... but i'm with all the others who say "explain why not A CPan parser". – DVK Jul 30 '10 at 14:15
  • 4
    Unless it's a homework problem, the OP *does* have other options – derby Jul 30 '10 at 16:15
1

One way to do it is to strip the comments prior to parsing:

$string =~ s/<!--.*?-->//gs;
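As a rough sketch of how that substitution fits into a regex-based workflow, the example below wraps it in a helper and then extracts an element from the cleaned text. The helper name `strip_xml_comments` and the sample `<config>`/`<name>` document are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every <!-- ... --> comment from an XML string.
sub strip_xml_comments {
    my ($xml) = @_;
    # /s lets . match newlines, so multi-line comments are covered;
    # /g removes every comment; .*? is non-greedy, so each match
    # stops at the first closing -->.
    $xml =~ s/<!--.*?-->//gs;
    return $xml;
}

my $xml = <<'XML';
<config>
  <!-- <name>old</name> -->
  <name>new</name>
  <!-- a comment
       spanning lines -->
</config>
XML

my $clean = strip_xml_comments($xml);

# Regex-based extraction now sees only the element outside the comments.
my @names = $clean =~ m{<name>(.*?)</name>}g;
print "@names\n";    # prints: new
```

One caveat: the substitution also removes anything that looks like a comment inside a CDATA section, so this only works cleanly when the input has no such sections.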
NullUserException