How to find the attributes of a "hidden" HTML input with BeautifulSoup in Python

Posted 2024-09-29 06:35:22


I am able to get the value of an ordinary element such as this one:

<input id="mbb-offeringID-1" type="checkbox" name="offeringID.1" value="cLtX5MPovjIvCLvcPPMSjATUpzLHNW%2Bk9pGa3%2BTkldS92roGTDDM%2B8BvfXGJP5GWE3DNQLNkJUny6cgknOlP%2F3xEODHJiPzzb3Io7oXbgDNP9cSrVkoaA5JVvTc1IPb%2BEcbB%2BeqhE7BAczO81wFLlD3SbtD55y7hOIa5DYQLzkaI9FHJTuyAphRUriSbCRuS">

import requests
from bs4 import BeautifulSoup

page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "lxml")
title = soup.find('input', {'id': 'mbb-offeringID-1'}).get('value')
print(title)

How can I retrieve the value of a hidden input, for example:

 <input type="hidden" name="offeringID.1" value="9Lt1oDtQ%2BIAdndBuUQBzl%2FXSUE8quGoqB41HEfz9IncLO4u3HybZ3EWtylW8vTJ1v3KZOS%2FPQRFGN6L0a0pjYFd8KcQ%2Bok3AsTNXxrQUaar1gXa7EHhACX2c%2Bh72E3izLUOwM4q6Wxw%3D">

Here is the full HTML:

<div class="a-fixed-right-grid-col aod-atc-column a-col-right" style="width:150px;margin-right:-150px;float:left;">

   <form method="post" action="/gp/add-to-cart/html/ref=aod_dpdsk_new_0" class="aod-atc-form-header-desktop a-spacing-none">

        <input type="hidden" name="session-id" value="146-6039598-0678601">
          
        <input type="hidden" name="offeringID.1" value="9Lt1oDtQ%2BIAdndBuUQBzl%2FXSUE8quGoqB41HEfz9IncLO4u3HybZ3EWtylW8vTJ1v3KZOS%2FPQRFGN6L0a0pjYFd8KcQ%2Bok3AsTNXxrQUaar1gXa7EHhACX2c%2Bh72E3izLUOwM4q6Wxw%3D">

2 Answers

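Since you also want to match on name="offeringID.1", which my answer to the question I marked as a duplicate did not cover, I'll post a solution here. You cannot pass name as a keyword argument to find_all(), because name is already the parameter that selects the tag name. Pass it through the attrs= argument instead: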

for tag in soup.find_all("input", type="hidden", attrs={"name": "offeringID.1"}):
    print(tag["value"])
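To try the lookup without a network request, here is a minimal sketch that parses an inline fragment modeled on the form from the question. The attribute values are placeholders rather than real data, and the CSS selector variant is an assumed alternative that the answer itself does not use.

```python
from bs4 import BeautifulSoup

# Markup modeled on the question's form; the values are placeholders.
html = """
<form method="post" class="aod-atc-form-header-desktop a-spacing-none">
  <input type="hidden" name="session-id" value="PLACEHOLDER-SESSION">
  <input type="hidden" name="offeringID.1" value="PLACEHOLDER-OFFERING-ID">
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# attrs= avoids the clash with find()'s own `name` parameter,
# which selects the tag name ("input" here).
by_attrs = soup.find("input", attrs={"type": "hidden", "name": "offeringID.1"})

# The same match written as a CSS attribute selector.
by_css = soup.select_one('input[type="hidden"][name="offeringID.1"]')

print(by_attrs["value"])
print(by_css["value"])
```

Both lookups should select the same tag; which one to use is mostly a matter of taste.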

Edit: The data is loaded from an external source via Ajax. You can get the "value" like this:

import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36",
    "referer": "https://www.amazon.ca/gp/aod/ajax/ref=auto_load_aod?asin=B07RF237B1&pc=dp",
}

params = (
    ("asin", "B07RF237B1"),
    ("pc", "dp"),
)

response = requests.get(
    "https://www.amazon.ca/gp/aod/ajax/ref=auto_load_aod",
    headers=headers,
    params=params,
)

soup = BeautifulSoup(response.content, "html.parser")

print(soup.find("input", type="hidden", attrs={"name": "offeringID.1"})["value"])
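Because the input only exists in the Ajax response, find() returns None when run against a page that lacks it, and indexing None with ["value"] raises a TypeError. The sketch below guards against that. The helper name hidden_input_value and both sample strings are hypothetical and are not part of the original answer.

```python
from typing import Optional

from bs4 import BeautifulSoup


def hidden_input_value(html: str, name: str) -> Optional[str]:
    """Return the value of the hidden input called `name`, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("input", attrs={"type": "hidden", "name": name})
    if tag is None:
        # The input is not in this document, e.g. it is loaded later via Ajax.
        return None
    return tag.get("value")


# Hypothetical stand-ins: a page without the input and a fragment with it.
static_page = '<div id="placeholder-container"></div>'
ajax_fragment = '<input type="hidden" name="offeringID.1" value="PLACEHOLDER-OFFERING-ID">'

print(hidden_input_value(static_page, "offeringID.1"))
print(hidden_input_value(ajax_fragment, "offeringID.1"))
```

A None result is a hint to look in the browser's network tab for the request that actually delivers the markup.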

You can do it like this:

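import requests
from bs4 import BeautifulSoup

# `url` and `headers` are the same as in the question.
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "lxml")
# Find every <input type="hidden"> on the page and print each tag.
hidden_tags = soup.find_all("input", type="hidden")
for tag in hidden_tags:
    print(tag)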
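If you need the values rather than the tags, one possible extension is to collect the hidden inputs into a dictionary keyed by name. This is a sketch on inline placeholder markup, not something the answer above does. It includes a checkbox that shares the name offeringID.1, as in the question, to show that only the hidden input is picked up.

```python
from bs4 import BeautifulSoup

# Placeholder markup: one checkbox and two hidden inputs.
html = """
<input id="mbb-offeringID-1" type="checkbox" name="offeringID.1" value="PLACEHOLDER-CHECKBOX">
<form method="post">
  <input type="hidden" name="session-id" value="PLACEHOLDER-SESSION">
  <input type="hidden" name="offeringID.1" value="PLACEHOLDER-OFFERING-ID">
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each hidden input's name to its value, skipping inputs without a name.
hidden_fields = {
    tag["name"]: tag.get("value", "")
    for tag in soup.find_all("input", type="hidden")
    if tag.has_attr("name")
}

for field_name, field_value in hidden_fields.items():
    print(field_name, field_value)
```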
