
I'm new to Perl and I'm trying to extract the text between all <li> </li> tags in a string and assign the results to an array, using a regex or split/join.

e.g.

my $string = "<ul>
                  <li>hello</li>
                  <li>there</li>
                  <li>everyone</li>
              </ul>";

So that this code...

foreach my $value (@array) {
    print "$value\n";
}

...results in this output:

hello
there
everyone
tshepang
user2809023
  • It is not a good idea to use regex for HTML. See [this answer](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags#1732454) – Jim Garrison Sep 23 '13 at 23:47
  • Yes, regex is a horribly wrong tool for this. – hlovdal Sep 23 '13 at 23:50
  • Regex is not a horrible tool; if it fits your need then use it, and it is probably faster than an HTML parser. With an HTML parser you know it's valid HTML and you can walk through the tree. – lordkain Sep 24 '13 at 03:55
  • Yes, I think you are being too harsh on the OP. S/he is not asking for a complex HTML parser, but something reasonable. Just need to split the string on `\n` and search for something like either `<li>(.+?)</li>` or `<li>([^<])`. I would answer but I have tried too hard to forget PERL. – beroe Sep 24 '13 at 04:32
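
For what it's worth, the regex route hinted at in the last comment could look roughly like the sketch below. It assumes the markup is as simple as in the question (flat `<li>` items with no attributes and no nesting); for anything less tidy, the earlier comments about using a real HTML parser apply. The `/s` flag and the choice to match against the whole string, rather than splitting on `\n` first, are my own assumptions and not something from the thread.

```perl
use strict;
use warnings;

my $string = "<ul>
                  <li>hello</li>
                  <li>there</li>
                  <li>everyone</li>
              </ul>";

# In list context, a match with /g returns every capture of group 1.
# (.+?) is non-greedy, so each match stops at the nearest </li>.
# /s lets . match newlines in case an item spans several lines.
my @array = $string =~ m{<li>(.+?)</li>}gs;

foreach my $value (@array) {
    print "$value\n";
}
```

With the sample string from the question this prints hello, there and everyone on separate lines. It will quietly miss items written as `<li class="x">` or `<LI>`, which is the kind of fragility the first two comments are warning about.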