Python web scraping: I have a website with picklists. How do I extract the text in those lists?

Posted 2024-09-29 07:19:38


The link is: https://www.doximity.com/sign_ups/9e016f85-d589-4cdf-8240-09c356d4434f/edit?sign_up[user_attributes][firstname]=Jian&sign_up[user_attributes][lastname]=Cui

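I want to scrape the professions together with the specialties that correspond to each one, but my code only manages to pull the professions.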

import requests, bs4

# Fetch the page as served (no JavaScript is executed)
r = requests.get('https://www.doximity.com/sign_ups/9e016f85-d589-4cdf-8240-09c356d4434f/edit?sign_up[user_attributes][firstname]=Jian&sign_up[user_attributes][lastname]=Cui')
soup = bs4.BeautifulSoup(r.text, 'lxml')
# Collect every <select> element found in the static HTML
spec = soup.find_all('select')

for sub in spec:
    print(sub.text)

Any advice would be appreciated.


1 answer
User
#1 · Posted 2024-09-29 07:19:38

Please try the code below and let me know if you run into any problems:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
import time

driver = webdriver.Chrome()
url = 'https://www.doximity.com/sign_ups/9e016f85-d589-4cdf-8240-09c356d4434f/edit?sign_up[user_attributes][firstname]=Jian&sign_up[user_attributes][lastname]=Cui'

driver.get(url)
# The find_element_by_* helpers were removed in Selenium 4.3; use find_element(By...., ...) instead.
spec = driver.find_element(By.ID, "sign_up_user_attributes_credential_id")
for sub in spec.find_elements(By.XPATH, './option | ./optgroup/option'):
    # Skip the empty placeholder option
    if sub.get_attribute('value') != '':
        print(sub.text)
        # Select this profession so the page fills in its specialty list
        selected_spec = Select(driver.find_element(By.ID, "sign_up_user_attributes_credential_id"))
        selected_spec.select_by_visible_text(sub.text)
        time.sleep(0.5)  # crude pause while the specialty list refreshes
        occup = driver.find_element(By.XPATH, '//select[@id="sign_up_user_attributes_user_professional_detail_attributes_specialty_id"]')
        for oc in occup.find_elements(By.XPATH, './option'):
            if oc.text != '' and oc.get_attribute('value') != '':
                print(oc.text)
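
As for why the requests-based attempt in the question prints only the professions: requests returns the HTML exactly as served, and the specialty list is presumably filled in by JavaScript only after a profession is chosen (an assumption, inferred from the select-then-wait step above). The sketch below illustrates this with made-up markup and the standard library's html.parser; the ids, option labels, and the names SAMPLE_HTML, SelectOptionParser, and options_by_select are all hypothetical and not taken from the real page.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for the sign-up form as served (before any JavaScript runs).
SAMPLE_HTML = """
<select id="profession">
  <option value="">Select one</option>
  <optgroup label="Physicians">
    <option value="1">MD</option>
    <option value="2">DO</option>
  </optgroup>
  <option value="3">Nurse Practitioner</option>
</select>
<select id="specialty">
  <option value="">Select one</option>
</select>
"""


class SelectOptionParser(HTMLParser):
    """Collect the text of each <option> with a non-empty value, grouped by <select> id."""

    def __init__(self):
        super().__init__()
        self.options = {}        # select id -> list of option texts
        self._select_id = None   # id of the <select> currently open, if any
        self._in_option = False  # True while inside an option worth recording
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self._select_id = attrs.get("id")
            self.options[self._select_id] = []
        elif tag == "option" and self._select_id is not None:
            # Placeholder options (empty or missing value) are skipped.
            self._in_option = bool(attrs.get("value"))
            self._text = []

    def handle_data(self, data):
        if self._in_option:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._in_option:
            self.options[self._select_id].append("".join(self._text).strip())
            self._in_option = False
        elif tag == "select":
            self._select_id = None


def options_by_select(html):
    parser = SelectOptionParser()
    parser.feed(html)
    parser.close()
    return parser.options


result = options_by_select(SAMPLE_HTML)
print(result)
```

In this sample the dependent list has no real options in the static HTML, so a static parse has nothing to extract from it. If the real page behaves the same way, a browser-driven approach like the Selenium code above is needed to get the specialties.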
