I'm currently trying to figure out how to loop through a set of studios on a fitness class website.
On the site's search results page, each page lists 50 studios and there are about 26 pages in total (https://classpass.com/search if you want to take a look).
My code parses a search results page, and Selenium gets the link for each studio on the page (in my full code, Selenium opens each link and scrapes data from that page).
After looping through all the results on page 1, I want to click the next-page button and repeat on page 2 of the results. Instead I get the error Message: no such element: Unable to locate element:
but I know the element is definitely on the results page and can be clicked. I tested this with a simplified script to confirm.
What might I be doing wrong? I've tried many suggestions, but so far none of them have worked.
from selenium import webdriver
from bs4 import BeautifulSoup as soup
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait as browser_wait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time
import re
import csv
# initialize the chrome browser
browser = webdriver.Chrome(executable_path=r'./chromedriver')
# URL
class_pass_url = 'https://www.classpass.com'
# Create file and writes the first row, added encoding type as write was giving errors
#f = open('ClassPass.csv', 'w', encoding='utf-8')
#headers = 'URL, Studio, Class Name, Description, Image, Address, Phone, Website, instagram, facebook, twitter\n'
#f.write(headers)
# classpass results page
page = "https://classpass.com/search"
browser.get(page)
# Browser waits
browser_wait(browser, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "line")))
# Scrolls to bottom of page to reveal all classes
# browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
# Extract page source and parse
search_source = browser.page_source
search_soup = soup(search_source, "html.parser")
pageCounter = 0
maxpagecount = 27
# Looks through results and gets link to class page
studios = search_soup.findAll('li', {'class': '_3vk1F9nlSJQIGcIG420bsK'})
while (pageCounter < maxpagecount):
    search_source = browser.page_source
    search_soup = soup(search_source, "html.parser")
    studios = search_soup.findAll('li', {'class': '_3vk1F9nlSJQIGcIG420bsK'})
    for studio in studios:
        studio_link = class_pass_url + studio.a['href']
        browser.get(studio_link)
        browser_wait(browser, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "line")))
    element = browser.find_element_by_xpath('//*[@id="Search_Results"]/div[1]/div/div/nav/button[2]')
    browser.execute_script("arguments[0].click();", element)
The browser has to be back on the main results page before the "next page" button can be found; your loop navigates away to the studio pages and then looks for the button there. You can fix this by restructuring the loop: first collect all of the studio URLs on the current results page, then click the next-page button element while you are still on the results page, and only visit the collected studio links afterwards.
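A minimal sketch of that restructuring, reusing the selectors from the question (the hashed li class name and the next-button XPath look auto-generated, so treat them as placeholders that may change):

```python
from bs4 import BeautifulSoup as soup

CLASS_PASS_URL = 'https://www.classpass.com'
STUDIO_LI_CLASS = '_3vk1F9nlSJQIGcIG420bsK'  # taken from the question; may change


def extract_studio_links(page_source):
    """Collect every studio URL from one results page's HTML."""
    search_soup = soup(page_source, 'html.parser')
    return [CLASS_PASS_URL + li.a['href']
            for li in search_soup.find_all('li', {'class': STUDIO_LI_CLASS})]


if __name__ == '__main__':
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    browser = webdriver.Chrome()
    browser.get('https://classpass.com/search')

    all_links = []
    for _ in range(27):  # maxpagecount from the question
        WebDriverWait(browser, 10).until(
            EC.visibility_of_element_located((By.CLASS_NAME, 'line')))
        # 1) Harvest the links while the browser is still on the results page.
        all_links.extend(extract_studio_links(browser.page_source))
        # 2) Click "next" BEFORE navigating anywhere else, so the button
        #    is still present in the current DOM.
        next_button = browser.find_element(
            By.XPATH, '//*[@id="Search_Results"]/div[1]/div/div/nav/button[2]')
        browser.execute_script('arguments[0].click();', next_button)

    # 3) Only now visit the collected studio pages and scrape each one.
    for link in all_links:
        browser.get(link)
        # ... scrape the studio page here ...
```

This keeps the pagination and the per-studio scraping in two separate phases, so the next-page lookup never runs while the browser is on a studio page. (`find_element(By.XPATH, ...)` is the current Selenium 4 spelling of the question's `find_element_by_xpath`.)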