Python + mechanize: asynchronous tasks

Posted 2024-10-01 04:49:25


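I have this bit of Python code that runs through a Delicious page and scrapes some links off of it. The extract method contains some magic that pulls out the required content. However, fetching the pages one after another is quite slow. Is there a way to do this asynchronously in Python, so that I can launch several GET requests and process the pages in parallel?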

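# Imports and browser setup were not shown in the original post;
# these are the usual ones for mechanize and BeautifulSoup (bs4).
import mechanize
from bs4 import BeautifulSoup

br = mechanize.Browser()

url = "http://www.delicious.com/search?p=varun"
page = br.open(url)
html = page.read()
soup = BeautifulSoup(html, "html.parser")
extract(soup)  # extract() is the poster's own function (not shown)

count = 1
# Follow the "Next" link onto consecutive pages
while soup.find('a', attrs={'class': 'pn next'}):
    print("yay")
    print(count)
    try:
        page = br.follow_link(text_regex="Next")
        # Re-parse into soup so the loop condition sees the new page
        soup = BeautifulSoup(page.read(), "html.parser")
        extract(soup)
    except mechanize.LinkNotFoundError:
        # No "Next" link left: this was the last page
        print("End of Pages")
        break
    count += 1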


1 Answer

User

#1 · Posted 2024-10-01 04:49:25

BeautifulSoup is pretty slow. If you want better performance, use lxml instead, or, if you have many CPUs, maybe you could try multiprocessing with queues.
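
To illustrate the lxml suggestion, here is a minimal sketch of pulling link targets out of a page. `extract_links` is a hypothetical stand-in for the asker's `extract` (whose logic isn't shown), and the fallback to the standard library's `html.parser` is only there so the snippet still runs where lxml isn't installed.

```python
from html.parser import HTMLParser

try:
    import lxml.html  # third-party: pip install lxml
except ImportError:
    lxml = None  # fall back to the standard-library parser


class _LinkCollector(HTMLParser):
    """Stdlib fallback: record the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return the non-empty href of every anchor in `html` (a str),
    in document order. Hypothetical replacement for extract()."""
    if lxml is not None:
        doc = lxml.html.fromstring(html)
        # XPath returns attribute values; convert them to plain str
        return [str(href) for href in doc.xpath("//a/@href") if href]
    collector = _LinkCollector()
    collector.feed(html)
    return collector.links
```

Whether this is actually faster for a given page is worth measuring; the point is only that the parsing step can be swapped without touching the fetching code.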

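On the parallel-fetch part of the question, a rough sketch using a thread pool from the standard library follows. Note that this swaps in threads for the multiprocessing-plus-queue setup suggested above, on the assumption that most of the time goes into waiting on the network. It also assumes the page URLs are known up front; the original loop discovers each page by following the Next link, which is inherently sequential. `fetch` and `fetch_all` are illustrative names, and plain `urllib` is used instead of a shared `br` because a mechanize browser object carries per-session state.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetcher=fetch, max_workers=5):
    """Fetch every URL concurrently with a pool of worker threads.

    Results are returned in the same order as `urls`. `fetcher` is
    injectable so the download step can be replaced (e.g. in tests).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs fetcher on several URLs at once but yields the
        # results in input order.
        return list(pool.map(fetcher, urls))
```

Calling `fetch_all(urls)` with a list of page URLs returns the page bodies in the same order, which can then be handed to the extraction step one by one. If parsing rather than downloading turns out to be the bottleneck, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface for spreading the work over several CPUs.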