BeautifulSoup: trying to get the selected value from IMDb, but getting an error

Posted 2024-09-28 17:20:21


I am trying to use BeautifulSoup to get the selected value from the following HTML, but I can't get it to work:

<select id="bySeason" tconst="tt0944947" class="current">
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option value="1">
    1
  </option>
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option selected="selected" value="8">
    2
  </option>
</select>

This is what I have tried, to no avail:

season_container = page_html.find_all("select", class_="current")
print(season_container.find_all('option', selected=True))

2 Answers

You're close.

season_container = page_html.find_all("select", class_="current")[0]  # find_all returns a list; take the first element
print(season_container.find_all('option', selected=True))

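The first line returns a list (a ResultSet), which has no find_all method of its own, so you have to index into it to pick (presumably) the first element. The rest of the code is fine.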
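To illustrate the difference, here is a minimal sketch that uses find() instead of indexing into the result of find_all(). The HTML is a trimmed-down copy of the snippet from the question, and the names matches, selected_options, and selected_values are made up for this example.

```python
from bs4 import BeautifulSoup

# Trimmed-down version of the HTML from the question (comments removed).
html = """
<select id="bySeason" tconst="tt0944947" class="current">
  <option value="1">1</option>
  <option selected="selected" value="8">2</option>
</select>
"""

page_html = BeautifulSoup(html, "html.parser")

# find_all() always returns a ResultSet (a list subclass), even for one match.
matches = page_html.find_all("select", class_="current")
print(len(matches))

# find() returns the first matching Tag directly, or None if nothing matches.
season_container = page_html.find("select", class_="current")

# selected=True matches any <option> that carries a "selected" attribute.
selected_options = season_container.find_all("option", selected=True)
selected_values = [option["value"] for option in selected_options]
print(selected_values)
```

If the page might not contain the select element at all, check that season_container is not None before calling find_all on it.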

You can narrow down the search by using the id:


from bs4 import BeautifulSoup

html = """<select id="bySeason" tconst="tt0944947" class="current">
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option value="1">
    1
  </option>
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option selected="selected" value="8">
    2
  </option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")
selected_value = soup.find("select", {"id":"bySeason"}).find("option",selected=True)

print(selected_value.get_text(strip=True))
print("   -")
print(selected_value["value"])

Output:

2
   -
8
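The same lookup can also be written as a single CSS selector. This is only a sketch of an alternative, not something from the answers above: it relies on select_one(), which BeautifulSoup provides through the soupsieve package, and the names option, season_label, and season_value are chosen for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed-down version of the HTML from the question (comments removed).
html = """
<select id="bySeason" tconst="tt0944947" class="current">
  <option value="1">1</option>
  <option selected="selected" value="8">2</option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

# "select#bySeason" targets the element by id; "option[selected]" matches
# any descendant <option> that has a "selected" attribute.
option = soup.select_one("select#bySeason option[selected]")

season_label = None
season_value = None
if option is not None:  # select_one() returns None when nothing matches
    season_label = option.get_text(strip=True)  # visible text of the option
    season_value = option["value"]              # the value attribute
    print(season_label, season_value)
```

The selector form keeps the whole query in one place, while the chained find() calls make each step easier to check for None separately.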
