perl get the first matched word in a line

Question

Sorry, I'm new to Perl and cannot find a similar answer.

HTML file

<div class="user_rating">
.
.
<span class="genre">
.
.
.
<span class="genre">
.
.
.
<span class="genre">
.
.
.
<span class="genre">

Perl file

$content =~ /<div class="user_rating">(.*)<span class="genre">/gs;
$empty = $1;

After this, the $empty variable contains everything from <div class="user_rating"> to the last <span class="genre">.

But I just want the information from <div class="user_rating"> to the first <span class="genre">. How should I modify my code? I know it is a regular expression problem.

Any help, please.

If you are going to do a lot of HTML parsing, look into something like `HTML::TreeBuilder` (http://search.cpan.org/~cjm/HTML-Tree-5.03/lib/HTML/TreeBuilder.pm), which will parse the HTML for you. A regex is certainly a useful quick-and-dirty solution for tasks like this, but it is not a robust way of processing HTML in general. — dan1111, Oct 23 '12 at 09:07
[Don't try to parse HTML with regexps](http://stackoverflow.com/a/1732454/470535) yourself, use an [HTML parser](http://search.cpan.org/dist/HTML-Parser/Parser.pm) instead. — dgw, Oct 23 '12 at 09:09

score 4 · Accepted Answer · edited Feb 04 '16 at 08:22


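Make the quantifier in your regexp non-greedy. `.*` is greedy, so it matches as much as possible and stops only at the last `<span class="genre">`; `.*?` matches as little as possible and stops at the first one.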

$content =~ /<div class="user_rating">(.*?)<span class="genre">/gs;
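To see the difference side by side, here is a minimal sketch. The sample `$content` is hypothetical, modeled on the HTML in the question, and the variable names `$greedy` and `$lazy` are just for illustration. It also drops the `/g` flag and takes the capture in list context, which is one way (an assumption about what is wanted, not the only option) to avoid reading a stale `$1` when the match fails.

```perl
use strict;
use warnings;

# Hypothetical sample input, modeled on the HTML in the question.
my $content = <<'HTML';
<div class="user_rating">
rating text
<span class="genre">Drama</span>
more text
<span class="genre">Comedy</span>
HTML

# Greedy: (.*) takes as much as it can, so the match ends at the LAST span.
my ($greedy) = $content =~ /<div class="user_rating">(.*)<span class="genre">/s;

# Non-greedy: (.*?) takes as little as it can, so the match ends at the FIRST span.
my ($lazy) = $content =~ /<div class="user_rating">(.*?)<span class="genre">/s;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
```

With this input, `$lazy` holds only the text before the first `<span class="genre">`, while `$greedy` runs on past the first span and stops just before the last one. The `/s` flag is still needed in both cases so that `.` can match newlines.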

edited Feb 04 '16 at 08:22 by zb226

answered Oct 23 '12 at 08:59 by Pavel Vlasov


@user1767718 Welcome to SO! If this answer worked for you, you may want to *accept* it as well. But also consider the parser hints in the question's comments :) – memowe Oct 23 '12 at 09:53
