0
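<?php
include_once("simple_html_dom.php");
// Accept only http(s) URLs: raw user input would also let file_get_contents() read local files.
$url = isset($_GET['txt']) ? $_GET['txt'] : '';
if (!preg_match('#^https?://#i', $url)) {
    die("Pass an http(s) URL in the 'txt' parameter.");
}
$raw = file_get_contents($url);               // download the page
if ($raw === false) {
    die("Could not download the page.");
}
file_put_contents("testhtml.txt", $raw);      // keep a copy of the raw HTML on disk
$html = file_get_html("testhtml.txt");        // parse the saved copy with simple_html_dom
?>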

Hello, I am using this code to extract the HTML of a website, and the HTML is saved in testhtml.txt. The problem is that I want to analyze that HTML: opening and closing tags, nesting, and unsupported tags. Which PHP function can I use, apart from regular expressions? Please help me.

fairy
  • Can someone tell me which PHP function I can use to extract unsupported and nested HTML tags? – fairy Mar 22 '14 at 09:56

2 Answers

0

An HTML document maps to a DOM tree, so in my opinion PHP's DOM functions should help: http://php.net/dom
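To make the DOM suggestion concrete, here is a minimal sketch, assuming PHP's bundled DOM extension is available. It loads the HTML with `DOMDocument::loadHTML()` and reads back whatever the underlying libxml parser complained about. The helper name `collectHtmlParseErrors` is made up for illustration, and the exact messages (for example about unexpected end tags or unknown tag names) depend on the libxml2 version installed, so treat the output as a hint rather than a full validation.

```php
<?php
// Hypothetical helper (not part of PHP): returns the parser's complaints
// about an HTML string as a list of "line N: message" strings.
function collectHtmlParseErrors(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    // Buffer libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Usage with the file saved by the question's script (file name taken from the question):
// foreach (collectHtmlParseErrors(file_get_contents('testhtml.txt')) as $message) {
//     echo $message, "\n";
// }
```

Because the parser repairs broken markup while loading, the resulting DOM tree is always well nested; the error list is the only place where the original problems remain visible.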

David
  • Thanks. Actually, I was able to extract the HTML from any website, but the problem is that I can't find a PHP function or regular expression to analyze HTML opening and closing tags, etc. – fairy Mar 18 '14 at 09:15
-2

I'm not sure I've understood what you want to do, but if you want to read the HTML page, isn't it better to do it with jQuery? If you are new to it, you can find a good starting guide at http://www.w3schools.com/jquery/default.asp and good help in the API documentation at http://api.jquery.com/

Furthermore, why do you need a script to see the HTML code of a website when you can "inspect" it and, if you are interested, expand all the tags and simply Ctrl+A/Ctrl+C and Ctrl+V it into a text file?

Andrea_86
  • I have already saved the HTML code in a file. What I want to do now is check whether all opening tags have their corresponding closing tags. – fairy Mar 17 '14 at 11:34
  • OK, so I didn't understand anything from your first question. Sorry, I can't help you with a PHP function or anything else. – Andrea_86 Mar 17 '14 at 13:13
  • Thanks, I just want to know how to analyze HTML tags (open, close, unsupported, nested) in a website using PHP – fairy Mar 18 '14 at 11:15
  • Have you tried the W3C validator? It reports errors (unclosed tags or closing tags that were never opened), tags deprecated in HTML5, and more. Try it at http://validator.w3.org/ In PHP I don't think there is a predefined function; you must write your own script that checks it – Andrea_86 Mar 18 '14 at 16:19
  • Please help me, I'm not finding enough resources for this – fairy Mar 22 '14 at 09:49
  • Can someone tell me which PHP function I can use to extract unsupported and nested HTML tags? – fairy Mar 22 '14 at 09:55
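
For the "write your own script" route mentioned in the comments, here is a rough sketch that avoids regular expressions by scanning for `<` and `>` with plain string functions and keeping a stack of open tags. The function name `checkTagBalance` and the message wording are invented for this example. It makes several simplifying assumptions: `<script>`/`<style>` bodies and `>` characters inside attribute values are not handled specially, and tags whose closing tag HTML allows you to omit (such as `<p>` or `<li>`) are still reported as unclosed.

```php
<?php
// Void elements never take a closing tag (list from the HTML standard).
const VOID_TAGS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
                   'input', 'link', 'meta', 'source', 'track', 'wbr'];

// Hypothetical helper: returns a list of tag-balance problems found in $html.
function checkTagBalance(string $html): array
{
    $problems = [];
    $stack = [];              // names of tags that are currently open
    $pos = 0;
    $len = strlen($html);

    while ($pos < $len && ($lt = strpos($html, '<', $pos)) !== false) {
        // Skip HTML comments entirely.
        if (substr($html, $lt, 4) === '<!--') {
            $end = strpos($html, '-->', $lt + 4);
            $pos = ($end === false) ? $len : $end + 3;
            continue;
        }
        $gt = strpos($html, '>', $lt + 1);
        if ($gt === false) {
            break;            // no closing '>' left, nothing more to scan
        }
        $inner = substr($html, $lt + 1, $gt - $lt - 1);
        $pos = $gt + 1;

        $isClosing = ($inner !== '' && $inner[0] === '/');
        $body = strtolower($isClosing ? substr($inner, 1) : $inner);
        $name = substr($body, 0, strspn($body, 'abcdefghijklmnopqrstuvwxyz0123456789-'));
        if ($name === '' || !ctype_alpha($name[0])) {
            continue;         // doctype, stray '<', and similar
        }

        if (!$isClosing) {
            $selfClosed = substr(rtrim($inner), -1) === '/';
            if (!$selfClosed && !in_array($name, VOID_TAGS, true)) {
                $stack[] = $name;
            }
            continue;
        }

        // Closing tag: find the nearest matching open tag on the stack.
        $match = -1;
        for ($i = count($stack) - 1; $i >= 0; $i--) {
            if ($stack[$i] === $name) {
                $match = $i;
                break;
            }
        }
        if ($match === -1) {
            $problems[] = "unexpected closing tag </$name>";
            continue;
        }
        // Everything opened after the match was never closed.
        while (count($stack) - 1 > $match) {
            $problems[] = 'unclosed tag <' . array_pop($stack) . '>';
        }
        array_pop($stack);
    }

    // Tags still open at the end of the input.
    while ($stack) {
        $problems[] = 'unclosed tag <' . array_pop($stack) . '>';
    }
    return $problems;
}

// Example: print_r(checkTagBalance('<div><b>Hi</div>'));
// reports one problem: "unclosed tag <b>"
```

The depth of the stack at any point is also the nesting depth of the current tag, so the same loop could be extended to report nesting. For a stricter check, the W3C validator mentioned above remains the more reliable tool.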