I'm working on a PHP parser that parses my school's HTML 'groups' pages. Each page has a unique URL based on the name of the course and several other variables, and consists of a bunch of HTML <table> elements.
Loading the HTML from the URL works fine until it comes across a ) in the file's content. Then it just stops loading and only keeps what it has read so far. I did not create the HTML, so there is no way I can prevent such characters from appearing in it.
However, it works fine when I run it locally using MAMP. I tried looking for answers, but haven't found anything that solves my problem.
How can I escape these characters before loading it?
My current PHP:
$dom = new DOMDocument;
libxml_use_internal_errors(true); // the HTML I parse contains a lot of unclosed tags; this keeps the errors from displaying on the page
$dom->loadHTMLFile('http://isarog.hhs.nl/Web_Site/HHS/ICTM/Public/Iris_Roster/Timetables/11_2/11_2-CMD-4vt-p2.html');
echo $dom->getElementsByTagName('html')->item(0)->nodeValue;
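One thing I could still try, assuming the truncation comes from a character-encoding mismatch between the server and my local MAMP setup (I have not confirmed this): convert the HTML to UTF-8 as a string and then hand it to loadHTML instead of loadHTMLFile. A minimal sketch of that idea follows. The helper name loadHtmlString, the sample markup, and the Windows-1252 source encoding are all hypothetical; the real charset of the page would need to be checked first.

```php
<?php
// Hypothetical helper: parse an HTML string whose byte encoding is known.
function loadHtmlString(string $html, string $sourceEncoding): DOMDocument
{
    // Convert to UTF-8 so the parser does not hit bytes that are invalid
    // in whatever encoding it would otherwise guess.
    $utf8 = mb_convert_encoding($html, 'UTF-8', $sourceEncoding);

    // Collect parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    // Prefixing an XML encoding declaration is a common trick to make
    // loadHTML treat the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $utf8);
    libxml_clear_errors();

    return $dom;
}

// Assumption: the input is Windows-1252. \xE9 and \x92 are bytes that are
// valid there but are not valid UTF-8 on their own.
$sample = "<html><body><table><tr><td>caf\xE9 \x92s</td></tr></table></body></html>";
$dom = loadHtmlString($sample, 'Windows-1252');
echo $dom->getElementsByTagName('td')->item(0)->nodeValue, "\n";
```

For the real page, the string could come from file_get_contents on the URL, which would also let me compare its length on the server and under MAMP to see whether the download or the parsing is what stops early.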