I am trying to run a simple script on my AWS instance. The same script works well on Windows 7 and Ubuntu (Python 2.7). But when I run it on my server, the website redirects me to an error page that says "you must enable js on your browser".

I have tried a lot of things so far (User-Agent, a redirect handler, mechanize). I get this redirection only with the domain below. All other JS-enabled websites work well.

Do you have any idea?

import urllib2
req = urllib2.Request("http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay")
response = urllib2.urlopen(req)
the_page = response.read()
print the_page

EDIT: It turns out the website is blocking my server's IP. Thanks for the help.

konjuge

1 Answer

There's no error in your code.

If the page relies on JavaScript, you need a JS interpreter as well.

urllib2 only fetches the raw response and does not execute the JavaScript in the page.

See: How to interpret JavaScript with Python


Also, it works fine for me with the following code:

import requests
session = requests.Session()
session.get('http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay').content.decode('utf8')

It returns a lot of HTML, like this:

<li class="">\n                            Çamaşır Makinesi</li>\n                    <li class="">\n                            Çamaşır Odası</li>\n                    <li class="selected">\n                            Çelik Kapı</li>\n                    <li class="">\n                            Şofben</li>\n                    <li class="">\n                            Şömine</li>\n                    </ul>\n            <h3>Dış Özellikler</h3>\n                <ul>\n                    <li class="">\n                            Asansör</li>\n                    <li class="">\n                            Engelliye Uygun</li>\n                    <li class="">\n                            Güvenlik</li>\n                    <li class="selected">\n                            Hidrofor</li>\n                    <li class="selected">\n                            Isı Yalıtım</li>\n                    <li class="">\n                            Jeneratör</li>\n                    <li class="selected">\n                            Kablo TV - Uydu</li>\n                    <li class="">\n                            Kapalı Garaj</li>\n                    <li class="">\n                            Kapıcı</li>\n                    <li class="">\n                            Kreş</li>\n                    <li class="">\n                            Otopark</li>\n                    <li class="">\n                            Oyun Parkı</li>\n                    <li class="selected">\n                            Ses Yalıtımı</li>\n                    <li class="">\n                            Siding</li>\n                    <li class="">\n                            Spor Alanı</li>\n                    <li class="selected">\n                            Su Deposu</li>\n                    <li class="">\n                            Tenis Kortu</li>\n                    <li class="">\n                            Yangın Merdiveni</li>\n                    <li class="">\n              
              Yüzme Havuzu (Açık)</li>\n                    <li class="">\n                            Yüzme Havuzu (Kapalı)</li>\n                    </ul>\n            <h3>Muhit</h3>\n                <ul>\n                    <li class="selected">\n                            Alışveriş Merkezi</li>\n                    <li class="">\n                            Belediye</li>\n                    <li class="selected">\n                            Cami</li>\n                    <li class="">\n                            Cemevi</li>\n                    <li class="">\n                            Denize Sıfır</li>\n                    <li class="selected">\n                            Eczane</li>\n                    <li class="">\n                            Eğlence Merkezi</li>\n                    <li class="">\n                            Fuar</li>\n                    <li class="selected">\n                            Hastane</li>\n                    <li class="">\n                            Havra</li>\n                    <li class="">\n                            Kilise</li>\n                    <li class="">\n                            Lise</li>\n                    <li class="selected">\n                            Market</li>\n                    <li class="selected">\n                            Park</li>\n                    <li class="">\n                            Polis Merkezi</li>\n                    <li class="selected">\n                            Sağlık Ocağı</li>\n                    <li class="selected">\n                            Semt Pazarı</li>\n                    <li class="">\n                            Spor Salonu</li>\n                    <li class="">\n                            Üniversite</li>\n                    <li class="selected">\n                            İlköğretim</li>\n                    <li class="">\n                            İtfaiye</li>\n                    <li class="">\n                            Şehir 
Merkezi</li>\n                    </ul>\n            <h3>Ulaşım</h3>\n                <ul>\n                    <li class="">\n                            Anayol</li>\n                    <li class="">\n                            Boğaz Köprüleri</li>\n                    <li class="selected">\n                            Cadde</li>\n                    <li class="">\n                            Deniz Otobüsü</li>\n                    <li class="">\n                            Dolmuş</li>\n                    <li class="selected">\n                            E-5</li>\n                    <li class="">\n                            Havaalanı</li>\n                    <li class="">\n                            Marmaray</li>\n                    <li class="selected">\n                            Metro</li>\n                    <li class="">\n                            Metrobüs</li>\n                    <li class="selected">\n                            Minibüs</li>\n                    <li class="">\n                            Otobüs Durağı</li>\n                    <li class="">\n                            Sahil</li>\n                    <li class="">\n                            TEM</li>\n                    <li class="">\n                            Tramvay</li>\n                    <li class="">\n                            Tren İstasyonu</li>\n                    <li class="">\n                            İskele</li>\n                    </ul>\n            <h3>Manzara</h3>\n                <ul>\n                    <li class="">\n                            Boğaz</li>\n                    <li class="">\n                            Deniz</li>\n                    <li class="">\n                            Doğa</li>\n                    <li class="">\n                            Göl</li>\n                    <li class="selected">\n                            Şehir</li>\n                    </ul>\n            <h3>Konut Tipi</h3>\n                <ul>\n            
        <li class="">\n                            Ara Kat Dubleks</li>\n                    <li class="">\n                            Bahçe Dubleksi</li>\n                    <li class="">\n                            Bahçe Katı</li>\n                    <li class="">\n                            Bahçeli</li>\n                    <li class="">\n                            Müstakil Girişli</li>\n                    <li class="">\n                            Tripleks</li>\n                    <li class="">\n                            Çatı Dubleksi</li>\n                    </ul>\n            </div>\n    </div>\n<script type="text/javascript">\n    var bannerZoneId = "101";\n</script>\n\n<div class="uiBox">\n        <div class="uiBoxTitle">\n            <h3>Hadi Taşının!</h3>\n        </div>\n        <div class="uiBoxContainer" id="adHelperBoxMov">\n            <div class="helper">\n                <ul>\n                    <script type="text/javascript">\n                        var classifiedFooterZone9 = "&amp;PAGE_NAME=ilan_detay_zone_9&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n                        var classifiedFooterZone10 = "&amp;PAGE_NAME=ilan_detay_zone_10&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n                        var classifiedFooterZone11 = 
"&amp;PAGE_NAME=ilan_detay_zone_11&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n                        var classifiedFooterZone12 = "&amp;PAGE_NAME=ilan_detay_zone_12&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n\n                        getBanner(bannerZoneId, classifiedFooterZone9);\n                        getBanner(bannerZoneId, classifiedFooterZone10);\n                        getBanner(bannerZoneId, classifiedFooterZone11);\n                        getBanner(bannerZoneId, classifiedFooterZone12);\n                    </script>\n                </ul>\n            </div>\n       

You could use the geturl() method of the response to check whether your request was redirected (the website might really be generating the message you got based on your server's IP, etc.). If it really is redirected, you can prevent the redirect or handle it some other way. See: How do I prevent Python's urllib(2) from following a redirect
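The geturl() check described above could look roughly like this. It is a minimal sketch, not a definitive implementation: the helper name check_redirect and its opener parameter are made up for illustration, and the import fallback is only there so the same snippet runs on Python 2 (urllib2, as in the question) and on Python 3.

```python
try:
    from urllib2 import urlopen          # Python 2, as in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent


def check_redirect(url, opener=urlopen):
    """Fetch url and report whether the final URL differs from the requested one.

    Hypothetical helper: `opener` defaults to urlopen and can be swapped
    for any callable that returns an object with geturl() and read().
    """
    response = opener(url)
    final_url = response.geturl()   # URL actually retrieved, after any redirects
    body = response.read()          # raw bytes; no JavaScript is executed
    # A plain string comparison: any difference at all counts as a redirect.
    return final_url != url, final_url, body
```

If this reports no redirect but the body still contains the "enable js" message, the server is returning different content for the same URL, which would be consistent with filtering on something like the client's IP rather than a redirect.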

Xiangru Lian
  • I know my code works well on Windows and Ubuntu. The problem is that the same script gets a redirect on an AWS Amazon Linux server. I am not getting the same response. – konjuge Feb 16 '15 at 19:15
  • @konjuge I cannot reproduce your situation, so I have no solid idea how to solve this. Did you try to send exactly the same request from your server and from your laptop? You could inspect the request using the developer tools in Firefox or Chrome and copy it into your code. You could also try the requests package rather than urllib2. – Xiangru Lian Feb 16 '15 at 19:33
  • Yes, I tried that. I also tried the requests package and got the same results. Maybe it is some permission issue on the AWS server, but I have no idea. – konjuge Feb 16 '15 at 19:46
  • @konjuge Did you use geturl() to make sure that your URL is redirected? – Xiangru Lian Feb 16 '15 at 19:56
  • Interesting: when I use geturl() it seems that there is no redirection, but the content of the web page is still different. – konjuge Feb 16 '15 at 20:03
  • @konjuge So that means the website generates that response based on your server. You could use a proxy server to determine whether the website does this based on your server's IP, if that is important to you. – Xiangru Lian Feb 16 '15 at 20:06
  • Yeah, you are right. I created a proxy server and tried it on my PC, and got the same result. The web page is blocking my server's IP address. – konjuge Feb 16 '15 at 20:28
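
The proxy test discussed in these comments can be sketched as follows. Everything named here is an assumption for illustration: the helper names fetch and compare_routes, the proxy address you would pass in, and the marker string, which should be whatever text actually appears on the error page.

```python
try:
    from urllib2 import ProxyHandler, build_opener          # Python 2
except ImportError:
    from urllib.request import ProxyHandler, build_opener   # Python 3


def fetch(url, proxy_url=None):
    """Return the page body as text, optionally routed through an HTTP proxy."""
    # An empty dict tells ProxyHandler to use no proxy at all,
    # ignoring any proxy settings picked up from the environment.
    proxies = {"http": proxy_url, "https": proxy_url} if proxy_url else {}
    opener = build_opener(ProxyHandler(proxies))
    return opener.open(url).read().decode("utf-8", "replace")


def compare_routes(url, proxy_url, marker, fetch=fetch):
    """Return (direct_blocked, proxied_blocked) for the same URL.

    `marker` is a piece of text assumed to appear only on the error page.
    """
    direct_blocked = marker in fetch(url)
    proxied_blocked = marker in fetch(url, proxy_url)
    return direct_blocked, proxied_blocked
```

If the marker shows up on only one of the two routes, the site is most likely deciding based on the address the request comes from. That matches what konjuge found by routing a PC through the server.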