I don't want people to use scripts to easily fetch all the content of my site. With PHP cURL I can get all the text and data from my site, but I have seen some sites that return only garbage text to such a request. For example, this Chinese site: www.jjwxc.net/onebook.php?novelid=6971&chapterid=6. If I use the following PHP:
$url = 'http://www.jjwxc.net/onebook.php?novelid=6971&chapterid=6';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Mimic a regular browser's request headers
$headers = array();
$headers[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png, image/gif, image/x-bitmap, image/jpeg, image/pjpeg, *;q=0.5";
$headers[] = "Cache-Control: max-age=0";
$headers[] = "Connection: keep-alive";
$headers[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$headers[] = "Accept-Language: en-us,en;q=0.5";
$headers[] = "Pragma: "; // empty value suppresses cURL's default Pragma header
$headers[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_ENCODING, '');      // send Accept-Encoding and let cURL decompress gzip/deflate
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
curl_setopt($ch, CURLOPT_TIMEOUT, 8);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.12) Gecko/2009070611 Firefox/3.0.12");
$data = curl_exec($ch);
curl_close($ch);
echo $data;
I only get garbage text back. Yet in a browser, even with JavaScript disabled, all the characters display correctly. Any idea how they do it? Thanks!
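For what it's worth, one explanation I considered is that it's not anti-scraping at all but a plain charset mismatch: the page might be served as GB2312/GBK while my own page renders the echoed bytes as UTF-8, which would look like garbage. Here is a minimal sketch of that idea; the GB2312 fallback is my guess for an older Chinese site, not something the site documents:

// Sketch, not a confirmed fix: convert the fetched bytes to UTF-8 before echoing.
// Assumes the "garbage" is an encoding mismatch (page in GB2312/GBK, my page in UTF-8).
$url = 'http://www.jjwxc.net/onebook.php?novelid=6971&chapterid=6';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, '');      // let cURL handle gzip/deflate
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($ch);
// Read the charset the server declares, before closing the handle
$contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE); // e.g. "text/html; charset=gb2312"
curl_close($ch);
$charset = 'GB2312'; // fallback guess
if ($contentType && preg_match('/charset=([\w-]+)/i', $contentType, $m)) {
    $charset = strtoupper($m[1]);
}
if ($charset !== 'UTF-8') {
    $data = iconv($charset, 'UTF-8//IGNORE', $data);
}
header('Content-Type: text/html; charset=UTF-8');
echo $data;

If the text is still garbage after a conversion like this, then the mangling presumably happens server-side, which brings me back to my question of how they detect and block scripts.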