Nested loop for Beautiful Soup text

### CREATING LOOP TO GO THROUGH PAGES ###

results = [] #variable to store loop results
for i in range (4): #goes through 4 pages (0-3)
    url = 'https://clutch.co/it-services/msp?page={}'.format(i) #passes the number inside range through the {}
    session = HTMLSession()
    resp = session.get(url)
    resp.html.render() #RENDERS INCASE ITS JAVASCRIPT SITE
    soup = BeautifulSoup(resp.html.html, features='lxml')
    print(url) #shows what page you are on as it is looping
    agencies = soup.find_all(class_='company-name')
    for a in agencies:
        text = (a.text)
    results.append(text)

print(results)

https://clutch.co/it-services/msp?page=0
https://clutch.co/it-services/msp?page=1
https://clutch.co/it-services/msp?page=2
https://clutch.co/it-services/msp?page=3
['\nAgency Partner Interactive LLC ', '\nTEAM International ', '\nAstute Technology Management ', '\nWP Tech Support ']

1 answer

User

#1 · Posted on 2024-06-28 20:14:39

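This happens because the statement that appends each entry to the results list is not inside the inner for loop, so it runs only once per page and stores just the last company name found on that page.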
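To illustrate the difference without any network access, here is a minimal sketch. The `pages` list and the names in it are made up for illustration; each inner list stands in for the `agencies` found on one page.

```python
# Hypothetical stand-in for the scraped pages: each inner list plays the
# role of `agencies` on one page (the names are invented for this sketch).
pages = [
    ["Alpha", "Beta"],
    ["Gamma", "Delta"],
]

outside = []  # append placed after the inner loop has finished
for agencies in pages:
    for a in agencies:
        text = a
    outside.append(text)  # runs once per page, with the last value of text

inside = []  # append placed inside the inner loop
for agencies in pages:
    for a in agencies:
        text = a
        inside.append(text)  # runs once per agency

print(outside)  # ['Beta', 'Delta']
print(inside)   # ['Alpha', 'Beta', 'Gamma', 'Delta']
```

Only the indentation of the `append` line differs between the two loops, yet the first keeps one entry per page and the second keeps every entry.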

Try this:

### CREATING LOOP TO GO THROUGH PAGES ###

from bs4 import BeautifulSoup
from requests_html import HTMLSession

results = []  # stores the company names collected from every page
session = HTMLSession()  # one session can be reused for all pages
for i in range(4):  # goes through 4 pages (0-3)
    url = 'https://clutch.co/it-services/msp?page={}'.format(i)  # inserts the page number into the {}
    resp = session.get(url)
    resp.html.render()  # renders the page in case the site relies on JavaScript
    soup = BeautifulSoup(resp.html.html, features='lxml')
    print(url)  # shows which page is being processed
    agencies = soup.find_all(class_='company-name')
    for a in agencies:
        text = a.text
        results.append(text)  # inside the inner loop, so it runs once per agency

print(results)
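The entries in the output above still carry a leading newline and a trailing space. As an optional follow-up, here is a minimal sketch of cleaning them up. `clean_names` is a hypothetical helper written for this example, not part of Beautiful Soup; the sample strings are copied from the output shown in the question.

```python
def clean_names(raw_texts):
    """Strip surrounding whitespace and drop entries that end up empty."""
    return [t.strip() for t in raw_texts if t.strip()]

# Sample values copied from the output shown in the question.
raw = ['\nAgency Partner Interactive LLC ', '\nTEAM International ']

results = []
results.extend(clean_names(raw))  # extend adds every cleaned item to the list
print(results)  # ['Agency Partner Interactive LLC', 'TEAM International']
```

Inside the scraping loop, calling `a.get_text(strip=True)` instead of `a.text` should give the same kind of trimmed string directly.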
