Python urllib2 redirect problem

Posted 2024-09-30 04:28:55


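I am trying to run a simple script on my AWS instance. The same script works fine on Windows 7 and Ubuntu (Python 2.7). But when I run it on the server, the website redirects me to an error page that says "JS must be enabled in your browser".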

So far I have tried many things (user agent, redirect handlers, mechanize, etc.). I get these redirects only with the domain below. All other JS-enabled websites work fine.

Any ideas?

import urllib2
req = urllib2.Request("http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay")
response = urllib2.urlopen(req)
the_page = response.read()
print the_page

Edit: It turned out the website was blocking my server's IP. Thanks for your help.


1 answer
User
#1 · Posted 2024-09-30 04:28:55

There is nothing wrong with your code.

You need a JS interpreter.

urllib2 only fetches the raw data; it does not interpret the JS code in the page.

You can check: How to interpret JavaScript with Python
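To make the point about raw data concrete, here is a minimal sketch in Python 3 using only the standard library. The HTML string is a made-up stand-in for a fetched page (it is not taken from the site in the question), and the class name `TextOutsideScripts` is hypothetical. It shows that an HTTP client sees the script as plain text, so an element that JavaScript would fill in stays empty.

```python
from html.parser import HTMLParser

# Hypothetical page: the text of the "price" element is set by JavaScript.
RAW_HTML = """<html><body>
<div id="price"></div>
<script>document.getElementById("price").textContent = "100 TL";</script>
</body></html>"""


class TextOutsideScripts(HTMLParser):
    """Separate text that is in the markup from text inside <script> tags."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.visible = []   # text present in the markup itself
        self.scripts = []   # script source, which nothing here executes

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.scripts.append(data)
        elif data.strip():
            self.visible.append(data.strip())


parser = TextOutsideScripts()
parser.feed(RAW_HTML)
parser.close()

# The price only exists inside the unexecuted script source.
print("visible text:", parser.visible)
print("script source:", "".join(parser.scripts))
```

If the content you need appears only in the script source, a plain HTTP client will not be enough and something has to run the JavaScript.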


Also, it works for me with the following code:

import requests
session = requests.Session()
session.get('http://www.sahibinden.com/ilan/emlak-konut-satilik-karatepe-emlak-tan-zumrutevler-de-2-plus1-ara-kat-luks-daire-186413632/detay').content.decode('utf8')

It returns a lot of HTML, like this:

<li class="">\n                            Çamaşır Makinesi</li>\n                    <li class="">\n                            Çamaşır Odası</li>\n                    <li class="selected">\n                            Çelik Kapı</li>\n                    <li class="">\n                            Şofben</li>\n                    <li class="">\n                            Şömine</li>\n                    </ul>\n            <h3>Dış Özellikler</h3>\n                <ul>\n                    <li class="">\n                            Asansör</li>\n                    <li class="">\n                            Engelliye Uygun</li>\n                    <li class="">\n                            Güvenlik</li>\n                    <li class="selected">\n                            Hidrofor</li>\n                    <li class="selected">\n                            Isı Yalıtım</li>\n                    <li class="">\n                            Jeneratör</li>\n                    <li class="selected">\n                            Kablo TV - Uydu</li>\n                    <li class="">\n                            Kapalı Garaj</li>\n                    <li class="">\n                            Kapıcı</li>\n                    <li class="">\n                            Kreş</li>\n                    <li class="">\n                            Otopark</li>\n                    <li class="">\n                            Oyun Parkı</li>\n                    <li class="selected">\n                            Ses Yalıtımı</li>\n                    <li class="">\n                            Siding</li>\n                    <li class="">\n                            Spor Alanı</li>\n                    <li class="selected">\n                            Su Deposu</li>\n                    <li class="">\n                            Tenis Kortu</li>\n                    <li class="">\n                            Yangın Merdiveni</li>\n                    <li class="">\n              
              Yüzme Havuzu (Açık)</li>\n                    <li class="">\n                            Yüzme Havuzu (Kapalı)</li>\n                    </ul>\n            <h3>Muhit</h3>\n                <ul>\n                    <li class="selected">\n                            Alışveriş Merkezi</li>\n                    <li class="">\n                            Belediye</li>\n                    <li class="selected">\n                            Cami</li>\n                    <li class="">\n                            Cemevi</li>\n                    <li class="">\n                            Denize Sıfır</li>\n                    <li class="selected">\n                            Eczane</li>\n                    <li class="">\n                            Eğlence Merkezi</li>\n                    <li class="">\n                            Fuar</li>\n                    <li class="selected">\n                            Hastane</li>\n                    <li class="">\n                            Havra</li>\n                    <li class="">\n                            Kilise</li>\n                    <li class="">\n                            Lise</li>\n                    <li class="selected">\n                            Market</li>\n                    <li class="selected">\n                            Park</li>\n                    <li class="">\n                            Polis Merkezi</li>\n                    <li class="selected">\n                            Sağlık Ocağı</li>\n                    <li class="selected">\n                            Semt Pazarı</li>\n                    <li class="">\n                            Spor Salonu</li>\n                    <li class="">\n                            Üniversite</li>\n                    <li class="selected">\n                            İlköğretim</li>\n                    <li class="">\n                            İtfaiye</li>\n                    <li class="">\n                            Şehir 
Merkezi</li>\n                    </ul>\n            <h3>Ulaşım</h3>\n                <ul>\n                    <li class="">\n                            Anayol</li>\n                    <li class="">\n                            Boğaz Köprüleri</li>\n                    <li class="selected">\n                            Cadde</li>\n                    <li class="">\n                            Deniz Otobüsü</li>\n                    <li class="">\n                            Dolmuş</li>\n                    <li class="selected">\n                            E-5</li>\n                    <li class="">\n                            Havaalanı</li>\n                    <li class="">\n                            Marmaray</li>\n                    <li class="selected">\n                            Metro</li>\n                    <li class="">\n                            Metrobüs</li>\n                    <li class="selected">\n                            Minibüs</li>\n                    <li class="">\n                            Otobüs Durağı</li>\n                    <li class="">\n                            Sahil</li>\n                    <li class="">\n                            TEM</li>\n                    <li class="">\n                            Tramvay</li>\n                    <li class="">\n                            Tren İstasyonu</li>\n                    <li class="">\n                            İskele</li>\n                    </ul>\n            <h3>Manzara</h3>\n                <ul>\n                    <li class="">\n                            Boğaz</li>\n                    <li class="">\n                            Deniz</li>\n                    <li class="">\n                            Doğa</li>\n                    <li class="">\n                            Göl</li>\n                    <li class="selected">\n                            Şehir</li>\n                    </ul>\n            <h3>Konut Tipi</h3>\n                <ul>\n            
        <li class="">\n                            Ara Kat Dubleks</li>\n                    <li class="">\n                            Bahçe Dubleksi</li>\n                    <li class="">\n                            Bahçe Katı</li>\n                    <li class="">\n                            Bahçeli</li>\n                    <li class="">\n                            Müstakil Girişli</li>\n                    <li class="">\n                            Tripleks</li>\n                    <li class="">\n                            Çatı Dubleksi</li>\n                    </ul>\n            </div>\n    </div>\n<script type="text/javascript">\n    var bannerZoneId = "101";\n</script>\n\n<div class="uiBox">\n        <div class="uiBoxTitle">\n            <h3>Hadi Taşının!</h3>\n        </div>\n        <div class="uiBoxContainer" id="adHelperBoxMov">\n            <div class="helper">\n                <ul>\n                    <script type="text/javascript">\n                        var classifiedFooterZone9 = "&amp;PAGE_NAME=ilan_detay_zone_9&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n                        var classifiedFooterZone10 = "&amp;PAGE_NAME=ilan_detay_zone_10&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n                        var classifiedFooterZone11 = 
"&amp;PAGE_NAME=ilan_detay_zone_11&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n                        var classifiedFooterZone12 = "&amp;PAGE_NAME=ilan_detay_zone_12&amp;CATEGORY_ID=16633&amp;PARENT_ID=16623&amp;CATEGORY_LEVEL_0=3518&amp;CATEGORY_LEVEL_1=3613&amp;CATEGORY_LEVEL_2=16623&amp;CATEGORY_LEVEL_3=16633&amp;CATEGORY_LEVEL_4=0&amp;CATEGORY_LEVEL_5=0&amp;CATEGORY_LEVEL_6=0&amp;LANGUAGE=tr&amp;CITY_ID=34&amp;DISTRICT_ID=2177&amp;TOWN_ID=446&amp;QUARTER_ID=23171" + cAttributes;\n\n                        getBanner(bannerZoneId, classifiedFooterZone9);\n                        getBanner(bannerZoneId, classifiedFooterZone10);\n                        getBanner(bannerZoneId, classifiedFooterZone11);\n                        getBanner(bannerZoneId, classifiedFooterZone12);\n                    </script>\n                </ul>\n            </div>\n       

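You can use the geturl() method to find out whether your URL was redirected (the website may generate the message you received based on the server's IP, etc.). If it really was redirected, you can prevent that or handle it some other way. See How do I prevent Python's urllib(2) from following a redirect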
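A minimal sketch of both ideas, written for Python 3, where urllib2 was split into urllib.request and urllib.error (in Python 2 the same classes live in urllib2). The names `NoRedirect`, `final_url`, and `redirect_target` are made up for this illustration, and this is one possible approach rather than the method from the linked question.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not follow"; the 3xx response is then
        # raised as an HTTPError by the default error handler.
        return None


def final_url(url):
    """Follow redirects as usual and return the URL that was finally served."""
    response = urllib.request.urlopen(url)
    try:
        return response.geturl()
    finally:
        response.close()


def redirect_target(url):
    """Return the Location header if url answers with a redirect, else None."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        opener.open(url).close()
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            return err.headers.get("Location")
        raise
    return None
```

Comparing `final_url(url)` with the URL you requested tells you whether a redirect happened, and `redirect_target(url)` shows where the server wanted to send you without actually going there.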
