
I need a script that opens a given web address and extracts all title values from the following piece of HTML, which appears many times on the page. This is just one example from the site:

<td><a title="Articlesiteslist.com Analysis" href="http://www.statscrop.com/www/articlesiteslist.com"><img src="http://static.statscrop.com/favicons.png" class="data_original img_icon" data-original="http://s2.googleusercontent.com/s2/favicons?domain_url=articlesiteslist.com" width="16" height="16" alt="articlesiteslist.com" title="articlesiteslist.com"> articlesiteslist.com</a></td>


From this I need only the title value: for title="example", only example should be output.

Thanks a lot for any help; I have been trying to solve this problem for two days now.

user3281831
  • Load the HTML with PHP's built-in DOM parser, and do: `foreach ($dom->getElementsByTagName('a') as $tag) { echo $tag->getAttribute('title'), '<br>'; }`. – Amal Murali Feb 07 '14 at 14:04
  • How do I do that? What would the whole script look like, then? I'm sorry, I know almost nothing about PHP. – user3281831 Feb 07 '14 at 14:43

1 Answer

To expand on Amal Murali's idea, do the following.

Suppose you want to load a file named "a.html":

<html>
<body>
Lorem ipsum dolor
<a title="Ravellavegas.com Analysis" href="http://somewebsite.com/" />
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
<a title="Articlesiteslist.com Analysis" href="http://someanotherwebsite.com/" />
incididunt ut labore et dolore magna aliqua.
</body>
</html>

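Then write the script as follows:

<?php

// load() parses the file as XML, which works here because a.html is
// well-formed; for real-world HTML pages use loadHTMLFile() instead.
$dom = new DOMDocument();
$dom->load('a.html');

// Print the title attribute of every <a> element, one per line.
foreach ($dom->getElementsByTagName('a') as $tag) {
    echo $tag->getAttribute('title').'<br/>';
}

?>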

This outputs:

Ravellavegas.com Analysis
Articlesiteslist.com Analysis
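
The question asks about a live page rather than a local, well-formed file, so here is a minimal sketch of the same DOM approach applied to an HTML string. The helper name `extract_link_titles` is hypothetical, and the markup is a shortened copy of the snippet from the question. It assumes the page source is already in a string; for a real site it could be read with `file_get_contents($url)`, where `$url` is a placeholder for the address.

```php
<?php
// Sketch: collect the title attribute of every <a> element in an HTML string.
// extract_link_titles() is a hypothetical helper name, not a PHP built-in.
function extract_link_titles(string $html): array
{
    $dom = new DOMDocument();

    // Real pages are rarely well-formed, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = [];
    foreach ($dom->getElementsByTagName('a') as $tag) {
        // Skip links that carry no title attribute at all.
        if ($tag->hasAttribute('title')) {
            $titles[] = $tag->getAttribute('title');
        }
    }
    return $titles;
}

// Shortened markup from the question, wrapped in a table so the <td> is valid.
$html = '<table><tr><td><a title="Articlesiteslist.com Analysis" '
      . 'href="http://www.statscrop.com/www/articlesiteslist.com">'
      . '<img src="http://static.statscrop.com/favicons.png" '
      . 'alt="articlesiteslist.com" title="articlesiteslist.com">'
      . ' articlesiteslist.com</a></td></tr></table>';

foreach (extract_link_titles($html) as $title) {
    echo htmlspecialchars($title), "<br/>\n";
}
```

Because only `<a>` elements are visited, the `title` on the nested `<img>` is not included in the result.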

Variant #2

<?php
$text = <<<EOT
<html>
<body>
Lorem ipsum dolor
<a title="Ravellavegas.com Analysis" href="http://somewebsite.com/" />
sit amet, consectetur adipisicing elit, sed do eiusmod tempor
<a title="Articlesiteslist.com Analysis" href="http://someanotherwebsite.com/" />
incididunt ut labore et dolore magna aliqua.
</body>
</html>
EOT;

// Capture the value between the quotes of every title="..." attribute.
preg_match_all('/title="(.*?)"/is', $text, $matches);
foreach ($matches[1] as $title) {
    echo htmlentities($title) . "<br />";
}
?>

This produces the same output:

Ravellavegas.com Analysis
Articlesiteslist.com Analysis
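
One caveat with the regular expression: on the markup from the question it would also pick up the `title` of the nested `<img>`, since it matches every `title="..."` in the text. As a rough sketch of one way around that, the DOM can be queried with XPath so that only the attributes of `<a>` elements are returned. The helper name `extract_link_titles_xpath` and the shortened sample markup are illustrative assumptions.

```php
<?php
// Sketch: select only the title attributes of <a> elements via XPath.
// extract_link_titles_xpath() is a hypothetical helper name.
function extract_link_titles_xpath(string $html): array
{
    $dom = new DOMDocument();

    // Keep parser warnings for imperfect HTML out of the output.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $titles = [];
    // '//a/@title' matches the title attribute node of every <a> element.
    foreach ($xpath->query('//a/@title') as $attr) {
        $titles[] = $attr->nodeValue;
    }
    return $titles;
}

// Shortened markup based on the question: the <a> and the <img> both have a title.
$html = '<table><tr><td><a title="Articlesiteslist.com Analysis" href="#">'
      . '<img src="x.png" title="articlesiteslist.com">'
      . ' articlesiteslist.com</a></td></tr></table>';

// A pattern like the one in Variant #2 matches both title attributes.
preg_match_all('/title="(.*?)"/is', $html, $matches);
print_r($matches[1]);

// The XPath query returns only the title of the <a> element.
print_r(extract_link_titles_xpath($html));
```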
Dmytro Dzyubak