How do I scrape drop-down menu options with Selenium WebDriver in Python?

3 Answers

User

#1 · edited 2024-10-04 01:23:54

To select a particular option, you can use the following:

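from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()

driver.get("https://some.site")  # placeholder URL
# find_element_by_* was removed in Selenium 4; use find_element(By.<strategy>, value)
el = driver.find_element(By.ID, 'Filter_ClientRegion')
for option in el.find_elements(By.TAG_NAME, 'option'):
    if option.text == 'A':  # or B or C...
        option.click()  # select() for older versions
        break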

To get the values of the options, you can use:

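from selenium.webdriver.common.by import By

options = []
driver.get("https://some.site")  # placeholder URL
el = driver.find_element(By.ID, 'Filter_ClientRegion')
# Collect the value attribute of every <option> under the select
for option in el.find_elements(By.TAG_NAME, 'option'):
    options.append(option.get_attribute("value"))
# print(options)
# A B C ...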

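Notes:
1. I could not fully test the code above because I don't have the full page source.
2. Note that the option elements sit inside the comment block <! template bindings={}, so you may not be able to retrieve their values.

User

#2 · edited 2024-10-04 01:23:54

This should be straightforward: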

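from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

# self.driver and wait_time are assumed to come from the surrounding class
array_options = []
# The locator is passed as a single (strategy, value) tuple
element = WebDriverWait(self.driver, timeout=wait_time).until(
    EC.visibility_of_element_located((By.ID, "Filter_ClientRegion")))
if element.tag_name == 'select':
    select = Select(element)
    dropdown_options = select.options
    for option in dropdown_options:
        array_options.append(option.text)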
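
If both the value and the visible text of each option are needed, the loop above could be wrapped in a small helper. This is only a sketch: option_pairs, FakeOption and FakeSelect are hypothetical names, not part of Selenium. The helper relies only on the .options property of Select and on each element's .text and get_attribute("value"), so the stand-in classes let it run without a browser.

```python
def option_pairs(select_like, skip_empty=True):
    """Return (value, text) tuples for each entry in select_like.options."""
    pairs = []
    for option in select_like.options:
        value = option.get_attribute("value")
        if skip_empty and not value:
            continue  # drop placeholder entries such as "All" with value=""
        pairs.append((value, option.text))
    return pairs


# Hypothetical stand-ins so the sketch runs without a browser.
# With Selenium, pass Select(element) instead of FakeSelect(...).
class FakeOption:
    def __init__(self, value, text):
        self._value = value
        self.text = text

    def get_attribute(self, name):
        return self._value if name == "value" else None


class FakeSelect:
    def __init__(self, options):
        self.options = options


demo = FakeSelect([FakeOption("", "All"), FakeOption("A", "A"), FakeOption("B", "B")])
print(option_pairs(demo))  # [('A', 'A'), ('B', 'B')]
```

Passing skip_empty=False keeps the empty-valued "All" entry as well.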

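User

#3 · edited 2024-10-04 01:23:54

You can do this with BeautifulSoup.

Because you mentioned selenium, this code uses it first, in case you need it to get past a login or anything else that requires selenium. If you don't need selenium, skip ahead to the line that builds the soup with BeautifulSoup. The earlier lines only show how to get the page source with selenium so that BeautifulSoup can work on it.

First find the select tag that holds all of the HTML, including the commented-out content. Then take each item in its contents, convert it to a string, join the strings into one large string, and prefix it with <select>. Parse that string into a new soup and find all of its option tags. From each tag, display whatever you need.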

>>> from selenium import webdriver
>>> driver = webdriver.Chrome()
>>> driver.get("https://some.site")  # placeholder URL; load the page before reading its source
>>> content = driver.page_source
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(content, 'lxml')
>>> select = soup.find('select', attrs={'id': 'Filter_ClientRegion'})
>>> items = []
>>> for item in select.contents:
...     items.append(str(item).strip())
...     
>>> items
['', '<option _ngcontent-pxo-26="" value="">All</option>', '', 'template bindings={} \n      <option _ngcontent-pxo-26="" value="A">A</option>\n      <option _ngcontent-pxo-26="" value="B">B</option>\n      <option _ngcontent-pxo-26="" value="C">C</option>\n      <option _ngcontent-pxo-26="" value="D">D</option>\n      <option _ngcontent-pxo-26="" value="E">E</option>\n      <option _ngcontent-pxo-26="" value="F">F</option>\n      <option _ngcontent-pxo-26="" value="G">G</option>\n    </select>\n  </div>\n</div>']
>>> newContents = '<select>' + ''.join(items)
>>> newSelectSoup = BeautifulSoup(newContents, 'lxml')
>>> options = newSelectSoup.find_all('option')
>>> len(options)
8
>>> for option in options:
...     option.attrs['value']
...     
''
'A'
'B'
'C'
'D'
'E'
'F'
'G'
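
The same idea can be sketched with only the standard library, in case adding BeautifulSoup is not an option. This is a minimal sketch, not a drop-in replacement: OptionCollector, option_values and SAMPLE are hypothetical names, and SAMPLE is made-up markup modelled on the output above, assuming the options really sit inside an HTML comment. It collects every option on the page rather than only those under Filter_ClientRegion.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect the value attribute of every <option>, including
    options that sit inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self.values.append(dict(attrs).get("value"))

    def handle_comment(self, data):
        # Re-parse the comment body so commented-out options are seen too.
        inner = OptionCollector()
        inner.feed(data)
        inner.close()
        self.values.extend(inner.values)


def option_values(page_source):
    """Return the option values found in an HTML string, in document order."""
    parser = OptionCollector()
    parser.feed(page_source)
    parser.close()
    return parser.values


# Hypothetical markup, assumed for illustration only.
SAMPLE = """
<select id="Filter_ClientRegion">
  <option value="">All</option>
  <!-- template bindings={}
  <option value="A">A</option>
  <option value="B">B</option>
  -->
</select>
"""

print(option_values(SAMPLE))  # ['', 'A', 'B']
```

With selenium, the string passed to option_values would be driver.page_source.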
