Preventing exceptions

Posted 2024-06-25 22:32:01


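I currently have a script that downloads the top headlines from Reddit's front page, and it almost always works. Occasionally I get the exception below. I know I should add try and except statements to protect my code, but where should I put them?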

Crawler:

import praw

def crawlReddit():
    r = praw.Reddit(user_agent='challenge')             # PRAW object
    topHeadlines = []                                   # List of headlines 
    for item in r.get_front_page():
        topHeadlines.append(item)                       # Add headlines to list
    return topHeadlines[0].title                            # Return top headline

def main():
    headline = crawlReddit()                            # Pull top headline

if __name__ == "__main__":
    main()              

Error:

(traceback not preserved; the answer below catches an HTTPError)

1 answer
Answer 1 · posted 2024-06-25 22:32:01

It looks like r.get_front_page() returns a lazily evaluated object, and you only need the first element from it. If so, try the following:

import time

import praw
# Assumption: the HTTPError in the traceback is the one from requests;
# import whichever class your traceback actually names.
from requests.exceptions import HTTPError


def crawlReddit():
    r = praw.Reddit(user_agent='challenge')  # PRAW object
    front_page = r.get_front_page()          # Appears to be lazily evaluated
    try:
        first_headline = next(front_page)    # Get the first item from front_page
    except HTTPError:
        return None                          # Tell the caller the request failed
    else:
        return first_headline.title


def main():
    max_attempts = 3
    attempts = 1
    headline = crawlReddit()
    while not headline and attempts < max_attempts:
        time.sleep(1)  # Make the program wait a bit before resending the request
        headline = crawlReddit()
        attempts += 1
    if not headline:
        print("Request failed after {} attempts".format(max_attempts))


if __name__ == "__main__":
    main()

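Edit: the code now tries to fetch the data up to 3 times, waiting 1 second after each failed attempt. After the third attempt it gives up, since the server may be offline, etc.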
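To see why the try belongs around the call that pulls the first item, and not around the call that creates the listing, here is a minimal sketch that runs without PRAW or a network connection. Everything in it is a hypothetical stand-in: `FakeHTTPError` plays the role of the real HTTPError, `make_flaky_listing` imitates a lazy listing whose first few fetches fail, and `first_item_with_retry` is one possible way to package the retry loop from `main()`.

```python
import time


class FakeHTTPError(Exception):
    """Hypothetical stand-in for the HTTPError raised by the real request."""


def make_flaky_listing(failures):
    """Return a factory for lazy listings; the first `failures` fetches fail."""
    state = {"remaining": failures}

    def listing():
        # Generator body: nothing here runs until the first next() call.
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise FakeHTTPError("simulated server error")
        yield "Top headline"
        yield "Second headline"

    return listing


def first_item_with_retry(make_listing, max_attempts=3, delay=1.0):
    """Return the first item of a lazy listing, or None if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        listing = make_listing()  # Creating the generator does not raise
        try:
            # The simulated request happens here; None if the listing is empty.
            return next(listing, None)
        except FakeHTTPError:
            if attempt < max_attempts:
                time.sleep(delay)  # Wait before trying again
    return None
```

For example, `first_item_with_retry(make_flaky_listing(2), delay=0)` fails twice and returns the first headline on the third attempt, while `make_flaky_listing(3)` exhausts all three attempts and yields `None`. A fresh listing is created on each attempt because a generator that has raised is finished and cannot be resumed.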
