I'm trying to parse some parts of an HTML page but I have problems with my regular expression. My code looks like this:

... Download page using wget and some other stuff ...

$PAGE_REGEXP = "\<div class="col bg_dark clear">";

#Array HTMLLines
@HTMLLines = split(/\n/, $Page);
foreach $ThisOne (@HTMLLines) {
    if ( ($Team) = ($ThisOne =~ /$PAGE_REGEXP/) ) {
        $T{TranslateTeams($Team)}++;
        $LastTeam=TranslateTeams($Team);
    };
};

This is the HTML page:

<div class="col bg_dark clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team A - Team B</div>
    <div class="col_2_archive left">
            1:4 (0:2)&nbsp;
    </div>

    <div class="col_5 left ">2.4&nbsp;</div>
    <div class="col_5 left ">3.6&nbsp;</div>
    <div class="col_5 left bold">2.9&nbsp;</div>
    <div class="col_8 left">
</div>

<div class="col  clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team C - Team D</div>
    <div class="col_2_archive left">
            2:3 (1:1)&nbsp;
    </div>

    <div class="col_5 left ">2.7&nbsp;</div>
    <div class="col_5 left ">3.7&nbsp;</div>
    <div class="col_5 left bold">2.5&nbsp;</div>
    <div class="col_8 left">
</div>

The information I need to parse is the team names, the final and halftime results, and the numbers in the "col_5 left" divs, e.g. 2.4, 3.6 and 2.9 for the game Team A - Team B.

When I run my script, Perl gives me the following error: Bareword found where operator expected at parser.pl line 11, near ""\

I'm not familiar with all the existing Perl modules; maybe I'm trying to do something that is quite easy to achieve with the right module. Can anybody please give me some hints/tips on how to parse this HTML page?

Thx

arge

1 Answer

The line with the regexp should probably look something like this:

$PAGE_REGEXP = '<div class="col bg_dark clear">';
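The corrected line above removes the syntax error, but the pattern still has no capture group, so `($Team) = ...` would not receive a team name, and matching line by line cannot see a score that sits on its own line. As a rough sketch only, assuming the page always has the exact layout shown in the question, one could match against the whole page and capture the fields directly. The names `$page`, `@games`, `$game_re` and `$odds_re` are made up for this illustration, and regex-based HTML parsing like this is brittle; a real HTML parser module from CPAN (HTML::TreeBuilder, for example) is likely to be more robust.

```perl
use strict;
use warnings;

# Sample input copied from the question; in the real script this would be $Page.
my $page = <<'HTML';
<div class="col bg_dark clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team A - Team B</div>
    <div class="col_2_archive left">
            1:4 (0:2)&nbsp;
    </div>

    <div class="col_5 left ">2.4&nbsp;</div>
    <div class="col_5 left ">3.6&nbsp;</div>
    <div class="col_5 left bold">2.9&nbsp;</div>
    <div class="col_8 left">
</div>

<div class="col  clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team C - Team D</div>
    <div class="col_2_archive left">
            2:3 (1:1)&nbsp;
    </div>

    <div class="col_5 left ">2.7&nbsp;</div>
    <div class="col_5 left ">3.7&nbsp;</div>
    <div class="col_5 left bold">2.5&nbsp;</div>
    <div class="col_8 left">
</div>
HTML

# One "col_5" div holding a decimal number (hypothetical helper pattern).
my $odds_re = qr{\s*<div\s+class="col_5[^"]*">\s*([\d.]+)(?:&nbsp;|\s)*</div>};

# Captures: 1 home team, 2 away team, 3 final score, 4 halftime score,
# 5 to 7 the three numbers from the col_5 divs.
my $game_re = qr{
    <div\s+class="col_3_archive\s+left">\s*(.+?)\s+-\s+(.+?)\s*</div>
    \s*<div\s+class="col_2_archive\s+left">
    \s*(\d+:\d+)\s*\((\d+:\d+)\)(?:&nbsp;|\s)*</div>
    $odds_re $odds_re $odds_re
}xs;

my @games;
# Scalar-context /g walks through the page one game at a time.
while ($page =~ /$game_re/g) {
    push @games, {
        home     => $1,
        away     => $2,
        final    => $3,
        halftime => $4,
        odds     => [ $5, $6, $7 ],
    };
}

for my $g (@games) {
    print "$g->{home} - $g->{away}: $g->{final} ($g->{halftime}), ",
          "odds @{ $g->{odds} }\n";
}
```

The `/x` flag lets the pattern be spread over several lines, and `/s` lets `.` cross line breaks, which matters because the score sits on a line of its own.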
szeryf
  • That is, you were not using quotes properly -- trying to use the double quote character inside a double quoted string -- in your original expression. – mob May 17 '12 at 19:19
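
To illustrate the quoting point from the comment, here is a small sketch of several ways to write a string that itself contains double quotes. The variable names are invented for this example; which form to use is mostly a matter of taste.

```perl
use strict;
use warnings;

# Single quotes: no interpolation, inner double quotes need no escaping.
my $single  = '<div class="col bg_dark clear">';

# Double quotes: every inner double quote must be escaped with a backslash.
my $escaped = "<div class=\"col bg_dark clear\">";

# q{} behaves like single quotes but with delimiters of your choice.
my $generic = q{<div class="col bg_dark clear">};

# qr{} builds a compiled regex with the same freedom of delimiters.
my $regex   = qr{<div class="col bg_dark clear">};

my $line = '<div class="col bg_dark clear">';
print "all three strings are identical\n"
    if $single eq $escaped && $escaped eq $generic;
print "pattern matches\n" if $line =~ $regex;
```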