0

When I click 'View source code' on a certain web page, I see something like this:

<form action="/WANem/index-advanced.php" method="post">
<table border="1" width="100%">
  <tr>
    <td width="10%" >Delay time(ms) </td>
    <td width="10%" ><input type="text" name="txtDelay1" size="7" value=1200>
  </td>
  <input type="submit" value="Apply settings" name="btnApply">  
</table>
</form>

My question is: how can I get '1200' out of this code using PHP? I mean, I just want to get a certain string from the HTML of another website without having to click 'View source code' and copy that string by hand. Thanks for any reply.

Tran Ngu Dang
  • Is the html on your site? Or is this an external site that you are trying to pull from? – bozdoz Nov 07 '11 at 18:14
  • This is the web interface of a program called WAN Emulator, which runs from a boot CD. There's no way to read or modify the PHP code on that Live CD (ISO file). – Tran Ngu Dang Nov 08 '11 at 01:16

3 Answers

1

What you're trying to do is called "web scraping".

Here's a StackOverflow question with a bunch of helpful answers:

How to implement a web scraper in PHP?

And here is a tutorial that probably explains it better than I could by typing it out here:

http://www.thefutureoftheweb.com/blog/web-scrape-with-php-tutorial

Hope it helps and good luck!

Teekin
0

In your PHP file, it is as simple as this:

$value = $_POST['txtDelay1'];

That said, this looks like a really basic question. I suggest you go through some tutorials to get an idea of how it all works.

The first Google result for "php form tutorial": http://www.phpf1.com/tutorial/php-form.html

EDIT:

Oh, now I see your edits. In that case, you can't avoid sending an HTTP request to get the source code, just like a browser does. Next, you have to parse the response from the server, again just like a browser does. The response will be the "source code" you're asking about. If you can, consider using Python for this; it will be much faster and more efficient.

If PHP is a must, be aware that this task is a pain in the ass ;)
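
For what it's worth, the fetch-then-parse steps described above can be sketched in PHP with the built-in DOM extension. This is only a minimal sketch: the helper name `getInputValue` is made up for illustration, and the markup is pasted in as a string so the example runs without network access. Against the real page, `$html` would presumably come from `file_get_contents($url)` or cURL instead.

```php
<?php
// Hypothetical helper (not part of any library): returns the value attribute
// of the first <input> with the given name, or null if there is none.
function getInputValue(string $html, string $name): ?string
{
    $doc = new DOMDocument();

    // Real-world HTML is often not well-formed; collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumption: $name is a trusted literal, so plain concatenation is fine.
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//input[@name="' . $name . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('value');
}

// Sample markup copied from the question.
$html = <<<'HTML'
<form action="/WANem/index-advanced.php" method="post">
<table border="1" width="100%">
  <tr>
    <td width="10%" >Delay time(ms) </td>
    <td width="10%" ><input type="text" name="txtDelay1" size="7" value=1200>
  </td>
  <input type="submit" value="Apply settings" name="btnApply">
</table>
</form>
HTML;

echo getInputValue($html, 'txtDelay1');
```

Parsing the document rather than searching it as text means attribute order and quoting style stop mattering, at the cost of a few more lines.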

Moyshe
  • I dunno, I find this easy in pretty much any language, including PHP. Fetch the doc, locate the string, copy it out... seems simple enough for any language with HTTP support, proper string functions and/or regex. – Teekin Nov 07 '11 at 18:28
0

You can do this with file_get_contents() and preg_match().

// $url is whatever your URL is
$html = file_get_contents($url);
// capture the digits that follow value= in the txtDelay1 input
if (preg_match('|name="txtDelay1".*?value=(\d+)|', $html, $matches)) {
    echo $matches[1];
}
// should print your value, 1200

Check out the regex here.

However

This is only going to work as long as the markup appears exactly as shown and is not duplicated on the page. Also, if you are scraping from another site, the owner could change the markup, and the regex would no longer match.
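
One way to soften those caveats, as a rough sketch rather than a definitive fix: accept quoted or unquoted values, and refuse to return anything when the field is missing or shows up more than once. The function name `extractDelay` and the sample string are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the numeric value of the txtDelay1 input,
// or null when the field is missing or appears more than once.
function extractDelay(string $html): ?int
{
    // ["']? tolerates value=1200, value="1200" and value='1200';
    // [^>]*? keeps the match inside a single tag.
    $pattern = '/name=["\']txtDelay1["\'][^>]*?value=["\']?(\d+)/i';

    $count = preg_match_all($pattern, $html, $matches);
    if ($count !== 1) {
        return null; // not found, ambiguous, or a regex error
    }

    return (int) $matches[1][0];
}

// Sample input line taken from the question's markup.
$sample = '<input type="text" name="txtDelay1" size="7" value=1200>';
var_dump(extractDelay($sample));
```

This still assumes `name` comes before `value` inside the tag, so it stays a pattern match on text and not a real HTML parse.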

bozdoz
  • No, this is my website! I just can't see its source code! Thank you very much! I will try this immediately. – Tran Ngu Dang Nov 08 '11 at 01:20
  • So this html is on your website? I do not understand. – bozdoz Nov 08 '11 at 01:22
  • I work in the networking field. There are a bunch of servers that can be accessed only via a web interface. I can view and use it, but I can't modify its source code. WAN Emulator is an example: http://wanem.sourceforge.net/ – Tran Ngu Dang Nov 08 '11 at 02:19