I am having real problems trying to extract the opening <html ...> tag from a block of HTML. I have the following Perl script which I am using to test:
#!/usr/bin/perl
my $text = '<html xmlns:v=3D"urn:schemas-microsoft-com:vml" xmlns:o=3D"urn:schemas- micr=osoft-com:office:office" xmlns:w=3D"urn:schemas-microsoft-com:office:word" =xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml" xmlns=3D"http:=//www.w3.org /TR/REC-html40"><head><META HTTP-EQUIV=3D"Content-Type" CONTENT==3D"text/html; charset=3Dus-ascii"><meta name=3DGenerator content=3D"Micros=oft Word 14 (filtered medium)">This is a test</HTML>';
my $html = "Add this first";
$text =~ /(<html .*>)(.*)/i;
print $text . "\n";
What I need is for the opening <html ...> tag to be captured in $1 and everything after it in $2. Then I can insert my own text using print $1 . $html . $2;
I just cannot get it to work :(
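For reference, here is a minimal sketch of the behaviour I am after. It assumes that no attribute value inside the <html ...> tag contains a literal '>', and it uses a shortened, hypothetical stand-in for the real string. The likely culprit in my test script is that .* is greedy, so (<html .*>) runs on to the last '>' in the input and leaves nothing for $2. The script also prints $text rather than the captures.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened, hypothetical stand-in for the real message body.
my $text = '<html xmlns:v=3D"urn:schemas-microsoft-com:vml"><head><meta name=3DGenerator>This is a test</HTML>';
my $html = "Add this first";

# [^>]* stops at the first '>', so $1 holds only the opening <html ...> tag
# and $2 holds everything after it.
# /i ignores case; /s lets . match newlines in multi-line input.
my $result = $text;    # fall back to the unchanged text if nothing matches
if ($text =~ /(<html\b[^>]*>)(.*)/is) {
    $result = $1 . $html . $2;
}
print $result . "\n";
```

A regex like this is only a stopgap for well-behaved input. A real HTML parser would be more robust if the markup varies.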