How to get around HTTP Error 403: Forbidden with urllib.request using Python 3

import urllib.request

url = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"

infile = urllib.request.urlopen(url)  # Open the URL
data = infile.read().decode('ISO-8859-1')  # Read the content as string decoded with ISO-8859-1

print(data)  # Print the data to the screen

2 Answers

User

Answer 1 · edited 2024-09-26 18:05:23

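This is probably due to mod_security. The server is rejecting the default Python urllib client, so you need to spoof a browser by sending a browser-like User-Agent header with the request.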

Here is your code, corrected:

import urllib.request

url = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"

# Open the URL as a browser would, not as Python urllib
page = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
infile = urllib.request.urlopen(page).read()
data = infile.decode('ISO-8859-1')  # Decode the content as a string using ISO-8859-1

print(data) # Print the data to the screen
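
The same idea can be wrapped in a small helper. This is only a minimal sketch, assuming the 403 really is triggered by the default urllib User-Agent; the names `build_browser_request` and `fetch_page`, the 10-second timeout, and the default encoding argument are illustrative choices, not part of the original code.

```python
import urllib.error
import urllib.request


def build_browser_request(url, user_agent="Mozilla/5.0"):
    """Build a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url, encoding="ISO-8859-1", timeout=10):
    """Return the decoded page body, or None if the server answers with an HTTP error."""
    request = build_browser_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode(encoding)
    except urllib.error.HTTPError as http_err:
        # A 403 here would suggest the server checks more than the User-Agent.
        print("HTTP error %s: %s" % (http_err.code, http_err.reason))
        return None
```

Calling `fetch_page(url)` with the `url` defined above would then replace the three lines that build the request, open it, and decode the result.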

Next, you can use BeautifulSoup to scrape the HTML.
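
As a rough illustration of that scraping step, here is a sketch that pulls link targets out of an HTML string. Note that it uses the standard library's `html.parser` as a stand-in, because BeautifulSoup is a third-party install; `LinkCollector` and `extract_links` are hypothetical names, and the choice to collect `href` values is just an example of what you might extract.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

With the code above, `extract_links(data)` would list the links on the downloaded page.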

User

Answer 2 · edited 2024-09-26 18:05:23

It looks like you are being rate limited. Try sleeping for a while and then retrying. For example:

import urllib.error  # HTTPError is defined here, so import it explicitly
import urllib.request
from time import sleep

LSE_URL = "http://www.londonstockexchange.com/exchange/prices-and-markets/stocks/indices/ftse-indices.html"
WAIT_PERIOD = 15

def stock_data_reader():
    stock_data = get_stock_data()
    while not stock_data:
        sleep(WAIT_PERIOD)  # sleep for a while until next retry
        stock_data = get_stock_data()

    print(stock_data)  # do something with stock data



def get_stock_data():
    try:
        infile = urllib.request.urlopen(LSE_URL) # Open the URL
    except urllib.error.HTTPError as http_err:
        print("Error: %s" % http_err)
        return None
    else:
        data = infile.read().decode('ISO-8859-1') # Read the content as string decoded with ISO-8859-1
        return data


stock_data_reader()
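
The loop above retries forever with a fixed wait. If you would rather give up after a few attempts and wait longer each time, something along these lines may work. It is only a sketch: `retry_with_backoff` and all of its parameters are hypothetical, and `fetch` stands for any zero-argument callable, such as `get_stock_data`, that returns `None` on failure.

```python
import time


def retry_with_backoff(fetch, max_attempts=5, initial_wait=1.0, factor=2.0,
                       sleep=time.sleep):
    """Call fetch() until it returns something other than None.

    Waits initial_wait seconds after the first failure and multiplies the
    wait by factor after each further failure. Returns None if every
    attempt fails. The sleep argument can be swapped out for testing.
    """
    wait = initial_wait
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts:
            sleep(wait)  # no point sleeping after the final attempt
            wait *= factor
    return None
```

For example, `retry_with_backoff(get_stock_data, max_attempts=5, initial_wait=WAIT_PERIOD)` would try at most five times, starting with the 15-second wait used above.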
