
I have just begun using tidy, but I am confused by its functions parseString(), repairString() and cleanRepair(). I have gone through the php.net manual and other sites but still cannot work it out. The PHP manual says that parseString() parses the document stored in a string and repairString() repairs the document stored in a string, but what is the difference between parsing and repairing? Both accept optional parameters, and they can be given the same parameters, so what is the difference? When should I use each function, and when both? I have seen a tutorial that used both functions. Can somebody help? Please also point to any useful links if you know of any. Thanks.

lovesh

1 Answer


parseString() takes a string and loads it into a tidy instance. Called as a method, it fills the object you call it on and returns a boolean; the procedural tidy_parse_string() creates and returns a new tidy instance instead. cleanRepair() then cleans and repairs the contents of that instance. You can get the tidied HTML by converting the tidy instance to a string, e.g. by echoing it.

repairString() does all of this in one go. Parsing followed by repairing is the most common combination, so it exists as a shortcut method. Note that it returns the repaired markup as a string, whereas parseString() and cleanRepair() return booleans to show whether the operation was successful and leave the result in the tidy instance.


So these are (approximately) equivalent:

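// Step by step: parse the string, clean and repair it, then output the result.
$tidy = new tidy();
$tidy->parseString($yourHTML);
$tidy->cleanRepair();
echo $tidy;

// Shortcut: repairString() parses, repairs and returns the result as a string.
$tidy = new tidy();
echo $tidy->repairString($yourHTML);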
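
Since the question mentions that both functions accept the same optional parameters, here is a rough sketch of the two routes side by side with a config array and an encoding passed to each. The sample markup in `$html` and the options in `$config` are made up for illustration, and the tidy extension is assumed to be installed; treat it as a starting point rather than a definitive recipe.

```php
<?php
// Illustrative input and options (not from the original question).
$html   = '<p>Unclosed paragraph<b>bold text';
$config = ['show-body-only' => true, 'wrap' => 0];

// Two-step route: parseString() loads the markup into $tidy and returns a bool.
$tidy   = new tidy();
$parsed = $tidy->parseString($html, $config, 'utf8');

// cleanRepair() also returns a bool; the repaired markup stays inside $tidy.
$cleaned = $tidy->cleanRepair();

// Converting the instance to a string yields the tidied markup.
$twoStep = (string) $tidy;

// One-step route: repairString() returns the repaired markup directly.
$oneStep = $tidy->repairString($html, $config, 'utf8');

var_dump($parsed, $cleaned);
echo $twoStep, "\n";
echo $oneStep, "\n";
```

The two-step route is mainly useful when you want to do something with the tidy instance between parsing and output; if you only need the repaired string, the shortcut is enough.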
lonesomeday
  • Thanks. But I still do not get the difference between parsing and cleaning. – lovesh May 28 '11 at 22:19
  • @lovesh Before `tidy` can do its magic on a string, it has to parse it: scan through, see what's going on, and store this in an internal representation. You can then clean it, though there are other operations you can do. And yes, you should be able to use `repairString()` as I have in my second example. – lonesomeday May 28 '11 at 22:22
  • Can you help me with this: http://stackoverflow.com/questions/6168558/unable-to-scrape-content-from-a-website – lovesh May 30 '11 at 16:37