I am trying to cache pages, in a sense, so that I can later display them as they used to be if they are changed or deleted. To do that, I'm pulling a page's entire HTML (Craigslist ads) into a database field.
I'm using file_get_contents for its ease and simplicity. There is more to it than this, but this is the basis of what I've done:
$page = file_get_contents('http://annapolis.craigslist.org/hea/3652436359.html');
// $page = mysql_real_escape_string($page);
// $page = htmlspecialchars($page);
// $page = htmlentities($page);
mysql_query("INSERT INTO `page` (`html`) VALUES ('$page')");
I have tried every built-in PHP sanitization function I could find:
- mysql_real_escape_string
- htmlspecialchars
- htmlentities
None of these sanitizes the page enough for it to be inserted into a MySQL database; MySQL throws a syntax error every time. Someone suggested I just base64-encode the HTML before inserting it, but I need to be able to search the HTML in the database, so that won't work for me.
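To show why base64 is out: the encoded form no longer contains the original text, so a substring search against the column can't match anything. A quick sketch (the sample HTML is just a made-up placeholder):

```php
<?php
// base64 destroys substring searchability: the encoded column no longer
// contains the literal text of the page, so a query like
//   SELECT * FROM `page` WHERE `html` LIKE '%apartment%';
// would find nothing.
$html = '<title>2 BR apartment</title>'; // placeholder, not a real ad
$encoded = base64_encode($html);
var_dump(strpos($encoded, 'apartment')); // bool(false)
```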
I've tried a variety of combinations, such as nesting one function inside another, but I can't get any of them to work.
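For example, one of those attempts nested the calls like this (same deprecated mysql_* API as the snippet above; it assumes the connection from earlier in the script is already open):

```php
// Attempt: HTML-encode first, then escape for MySQL before inserting.
// Assumes an open mysql_* connection, as in the first snippet.
$page = file_get_contents('http://annapolis.craigslist.org/hea/3652436359.html');
$page = mysql_real_escape_string(htmlspecialchars($page));
mysql_query("INSERT INTO `page` (`html`) VALUES ('$page')");
```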
Any and all help would be greatly appreciated.