Multiple requests in a multithreaded loop in Python

Published 2024-10-02 20:35:17


My dataframe df contains more than 600 URLs, and I want to read a specific value from an element on each page. The following code works:

ownerlist = []
for links in tqdm(df['Link'], leave=False, position=0):
    ownersite = s.get(links, cookies=cookies)
    owsoup = BeautifulSoup(ownersite.content, 'lxml')
    owner = owsoup.find('input', {'id': 'GlobalBodyContent_InternalBodyContent_BodyContent_Owner'}).get('value')
    ownerlist.append(owner)
    #print(len(ownerlist),owner)
df['Owner'] = ownerlist
print(df)

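But it takes 40 minutes to complete all the requests. I tried a multithreaded approach but could not get it to work. It runs faster, but instead of 600+ items I end up with only 2 or 3 items in my list. I tried: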

owner = []
def mt(links):
    ap = s.get(links, cookies=cookies)
    apsoup = BeautifulSoup(ap.content, 'lxml')
    ap1 = apsoup.find('input', {'id': 'GlobalBodyContent_InternalBodyContent_BodyContent_Owner'}).get('value')
    #print(ap1)
    owner.append(ap1)

def main():

    for links in tqdm(df['Link']):
        threadProcess = threading.Thread(name='simplethread', target=mt, args=[links])
        threadProcess.daemon = True
        threadProcess.start()

main() 

How can I get the loop to finish in less than 40 minutes? Thanks.
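
One likely reason for the 2 or 3 items: the threads in the attempt above are started but never joined, so the list is read while most requests are still in flight, and daemon threads are dropped when the main thread exits. A minimal sketch of one common alternative, a thread pool from the standard library's `concurrent.futures`, is below. The names `fetch_all` and `fake_fetch_owner` are hypothetical and not part of the question; `fake_fetch_owner` only stands in for the real request-and-parse function so that the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch_one, max_workers=16):
    """Call fetch_one(url) for every URL on a pool of worker threads.

    executor.map yields results in the same order as `urls`, so the
    returned list lines up with the input rows. Leaving the `with`
    block waits until every worker thread has finished.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch_one, urls))


# Hypothetical stand-in for the real scraping function (assumption:
# the real one would do the s.get(...) call and the BeautifulSoup
# lookup from the question, then return the owner value).
def fake_fetch_owner(url):
    return "owner-of-" + url


urls = ["page1", "page2", "page3"]
owners = fetch_all(urls, fake_fetch_owner, max_workers=2)
print(owners)
```

Applied to the question, `mt` would `return ap1` instead of appending to a shared list, and the column could then be filled with `df['Owner'] = fetch_all(df['Link'], mt)`. The value of `max_workers` here is a guess; a suitable number depends on how many parallel requests the site tolerates. It is also worth checking whether sharing the single session `s` between threads is safe for this use.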

