1

Using file_get_contents, I open an Internet URL and get the contents of the webpage.

Inside the HTML there are many span tags with the same class:

<span class="always-the-same-class">always dynamic text</span>

Now, I want to get an array containing all the "dynamic text" contained in any of these tags. It is not necessary to eliminate duplicate entries (I need them).

Is this possible? How could I do it?

Lightness Races in Orbit
  • Based on the fact that you want to work with HTML components, I can't see a use for PHP, only JavaScript. Use jQuery's .each() function, it will help, and then send this information to PHP –  Jan 08 '12 at 17:44
  • http://simplehtmldom.sourceforge.net/ here you will find all such problems with solutions.. – Rajat Singhal Jan 08 '12 at 17:45
  • @Eugen Rieck: isn't preg_match for finding all occurrences of a given string? If so, that is not all I need: I need to find the text after that string too. –  Jan 08 '12 at 17:48
  • @Gerep: JavaScript is not useful in my context: the page is opened by a cronjob+curl and the browser never gets opened. –  Jan 08 '12 at 17:49
  • Please add that information to your question ;) –  Jan 08 '12 at 17:52
  • @RajatSinghal: this is interesting. –  Jan 08 '12 at 17:52
  • @Gerep: I asked a PHP question! :-) –  Jan 08 '12 at 17:52

3 Answers

2

If I understood correctly, this has to be PHP as it is on the server, not in the browser. So I'd do something like

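// HTML_URL is assumed to be a constant holding the page URL
$html = file_get_contents(HTML_URL);
// The s modifier lets . match newlines, in case a span's text wraps lines
$a = preg_match_all('/<span class="always-the-same-class">(.*?)<\/span>/s', $html, $b);
echo $a, "\n";
print_r($b[1]);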

$a holds the hit count, $b[1] the hits (the text captured inside each span).

Tested this against

<html>
.. blah ..
<body>

.. blah ..

<span class="always-the-same-class">always dynamic text A</span>
<span class="always-the-same-class">always dynamic text B</span>
<span class="always-the-same-class">always dynamic text C</span>

.. blah ..

</body>
</html>

and output was

3
Array
(
    [0] => always dynamic text A
    [1] => always dynamic text B
    [2] => always dynamic text C
)
Eugen Rieck
0

jQuery:

var spanText = $('.always-the-same-class').text();
Alex
  • I don't need jQuery and JavaScript! I posted a PHP question! –  Jan 08 '12 at 17:51
  • I see 2 ways: 1: a preg_match; 2: write a function that runs the jQuery function and returns its value, and then you grab it and store it in a variable :), all of this inside PHP – Alex Jan 08 '12 at 17:55
  • JavaScript is not useful in my context: the page is opened by a cronjob+curl and the browser never gets opened –  Jan 08 '12 at 17:57
  • Useful or not, trust me... it would do its thing, and I am saying this because I did it. But anyways, I do prefer @Eugen Rieck's answer – Alex Jan 08 '12 at 18:06
  • @wOrldart: the browser is not opened, I can't change my context, there's simply no client side, please accept it :-D –  Jan 08 '12 at 18:13
  • oh lol, you are right... now I have realized the no browser... damn it these long nights :D – Alex Jan 08 '12 at 18:17
0

You can parse this content using the DOMDocument class that PHP provides. Once you load the content into the DOMDocument, you can get the span tags with $content->getElementsByTagName('span'). You can then filter the results by the tags' attributes and read their content.
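
A minimal sketch of that approach, under a few assumptions: the helper name spanTexts is hypothetical, the class attribute is compared as an exact string (a span carrying several classes would not match), and the inline HTML stands in for whatever file_get_contents returns for the real page.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the text
// of every <span> whose class attribute equals $class, duplicates included.
function spanTexts(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect libxml parse
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('span') as $span) {
        // Exact match on the attribute value (an assumption of this sketch).
        if ($span->getAttribute('class') === $class) {
            $texts[] = $span->textContent;
        }
    }
    return $texts;
}

// Sample markup standing in for the fetched page.
$html = '<html><body>'
      . '<span class="always-the-same-class">always dynamic text A</span>'
      . '<span class="other">ignored</span>'
      . '<span class="always-the-same-class">always dynamic text B</span>'
      . '</body></html>';

print_r(spanTexts($html, 'always-the-same-class'));
```

In the cron job from the question, the call would probably look like spanTexts(file_get_contents($url), 'always-the-same-class'), with $url being the page address. Unlike the regular expression, this does not depend on attribute order or on the exact spacing inside the tag.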

Calebj