Python Selenium: scraping inconsistent fields

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait


def init_driver():
    # Start headless Chrome with a fixed window size
    options = webdriver.ChromeOptions()
    options.binary_location = '/usr/bin/google-chrome-stable'
    options.add_argument('headless')
    options.add_argument('window-size=1200x600')
    driver = webdriver.Chrome(chrome_options=options)
    driver.wait = WebDriverWait(driver, 5)
    return driver


def scrape(driver):
    # ymm = year, make, model. All three attributes are in the header;
    # parse and separate them before inserting into SQL.
    ymm_element = driver.find_elements_by_xpath('//*[@id="compareForm"]/div/div/ul/li/div/div/h3')
    engine_element = driver.find_elements_by_xpath('//*[@id="compareForm"]/div/div/ul/li/div/div/div[3]/dl[1]/dd[1]')
    trans_element = driver.find_elements_by_xpath('//*[@id="compareForm"]/div/div/ul/li/div/div/div[3]/dl[1]/dd[2]')
    milage_element = driver.find_elements_by_xpath('//*[@id="compareForm"]/div/div/ul/li/div/div/div[3]/dl[1]/dd[3]')

1 answer

User

#1 · Posted 2024-09-29 19:24:41

First, XPath lets you match on text with contains(), like this:

driver.find_elements_by_xpath('//dt[contains(text(), "Engine")]')

This is cleaner, easier to work with, and more robust than a long positional path.
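As a minimal sketch of the contains() approach, the helper below builds the locator string for any label. The name `label_xpath` is hypothetical, and the sketch assumes that the labels are `dt` elements (as in the answer) and that a label never contains a double quote.

```python
def label_xpath(label):
    """Build an XPath matching any <dt> whose text contains `label`.

    Hypothetical helper; assumes `label` contains no double quotes.
    """
    return '//dt[contains(text(), "{}")]'.format(label)


# With the driver from the question (needs a live browser session):
# engine_labels = driver.find_elements_by_xpath(label_xpath("Engine"))
print(label_xpath("Engine"))
```

Keeping the label as a parameter means one helper can serve every field, instead of one hand-written positional path per field.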

Second, read up on the XPath axes following-sibling, preceding-sibling, parent, and ancestor. They will help you build concise XPath locators:

^{pr2}$

The XPath above works regardless of the order in which your HTML elements appear.
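One way a sibling axis can make a locator independent of element order is sketched below. It is not necessarily the answerer's exact expression. It assumes each value `dd` directly follows its `dt` label inside the same `dl`. The helper name `value_xpath` and the label texts "Transmission" and "Mileage" are illustrative assumptions, not taken from the page.

```python
def value_xpath(label):
    """Build an XPath for the first <dd> that follows a <dt> containing `label`.

    Hypothetical helper; assumes `label` contains no double quotes.
    """
    return '//dt[contains(text(), "{}")]/following-sibling::dd[1]'.format(label)


# Assumed label texts; adjust them to what the page actually shows.
FIELDS = ("Engine", "Transmission", "Mileage")
locators = {name: value_xpath(name) for name in FIELDS}

for name, xpath in locators.items():
    print(name, "->", xpath)

# With the driver from the question (needs a live browser session):
# engine_values = driver.find_elements_by_xpath(locators["Engine"])
```

Because each locator starts from the label text rather than from a position such as dd[1], a listing that omits or reorders a field should no longer shift the other values into the wrong column.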
