We are trying to parse HTML like this:
<li><a class="newsMarquee" href="http://www.lebanonfiles.com/news/617843">مستخدمو "كهرباء لبنان": الاضراب مستمر حتى إقرار موازنة 2013 الخاصة بنا</a></li>
<li><a class="newsMarquee" href="http://www.lebanonfiles.com/news/617840">اجتماع برئاسة محافظ الجنوب بحث في اوضاع النازحين</a></li>
We are getting this as the result:
ÃÑÚÃÉ ÇááÌÇä ÃÑÓÊ ËáÇËÉ ãÔÇÑÃÚ ÈÃÆÃÉ ãÓÊÎÃãæ "ßåÑÈÇà áÈäÇä": ÇáÇÖÑÇÈ ãÓÊãÑ ÃÊì ÅÞÑÇÑ ãæÇÒäÉ 2013 ÇáÎÇÕÉ 銂
We have already set header("Content-Type: text/html; charset=utf-8"); but the Arabic text still comes out garbled.
Any suggestions?
This is the code:
<?php
// Send the header before any output; header() cannot take effect once output has been sent.
header("Content-Type: text/html; charset=utf-8");
echo '<html><head>';
echo '</head>';
echo '<body>';
// $url is assumed to be set earlier to the page being parsed (not shown in this post).
$dom = new DOMDocument('1.0');
@$dom->loadHTMLFile($url); // @ suppresses warnings about malformed HTML
$params = $dom->getElementsByTagName('div'); // Find sections
foreach ($params as $param) // Go to each section one by one
{
    if ($param->getAttribute('class') == 'no-js')
    {
        $params2 = $param->getElementsByTagName('a');
        foreach ($params2 as $link)
        {
            // Stop at the first link that is not a news item
            if ($link->getAttribute('class') != 'newsMarquee')
                break;
            echo $link->nodeValue . '<br/>';
            //echo 'Link: ' . $link->getAttribute('href') . '<br/><br/>';
        }
    }
}
echo '</body>';
echo '</html>';
?>
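
A possible lead, offered as an assumption rather than a confirmed diagnosis: the garbled output looks like Windows-1256 bytes that were decoded as ISO-8859-1. The minimal sketch below takes one word from the garbled output, turns it back into raw bytes, and decodes those bytes as Windows-1256. It assumes this file is saved as UTF-8 and that the iconv extension on the machine supports WINDOWS-1256; the variable names are only for illustration.

```php
<?php
// Assumption: the source page is Windows-1256 but was decoded as ISO-8859-1.
// Requires the iconv extension with WINDOWS-1256 support.

// One word copied from the garbled output in the question.
$garbled = 'ÇáÇÖÑÇÈ';

// Step 1: undo the wrong decoding to recover the raw bytes of the page.
$rawBytes = iconv('UTF-8', 'ISO-8859-1', $garbled);

// Step 2: decode those bytes with the charset the page is suspected to use.
$repaired = iconv('WINDOWS-1256', 'UTF-8', $rawBytes);

echo $repaired, "\n";
```

If the assumption holds, this prints الاضراب, which is a word from the first headline in the sample HTML. That would suggest the charset of the source page, not the header() call, is where the text gets mangled.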