Unable to access the element from an SRE match when using BeautifulSoup

Posted 2024-06-02 11:13:02

I am scraping a page like this:

s1 = bs4DerivativePage.find_all('table', class_='not-clickable zebra')

Output:

[<table class="not-clickable zebra" data-price-format="{price}" data-quote-detail="0" data-stream-id="723288" data-stream-quote-option="Standard">
 <tbody><tr>
 <td><strong>Stop loss-niveau</strong></td>
 <td>141,80447</td>
 <td class="align-left"><strong>Type</strong></td>
 <td>Turbo's</td>
 </tr>
 <tr>
 <td><strong>Financieringsniveau</strong></td>
 <td>135,05188</td>

I need to get the value for Financieringsniveau. The following produces a match:

finNiveau = re.search('Financieringsniveau', LineIns1)

However, what I need is the numeric value 135,05188. How can I get it?


2 answers

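Assuming the data-stream-id attribute value is unique (in combination with the table tag), you can use a CSS selector and avoid re. This is a fast way to retrieve the value.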

from bs4 import BeautifulSoup

html = '''
<table class="not-clickable zebra" data-price-format="{price}" data-quote-detail="0" data-stream-id="723288" data-stream-quote-option="Standard">
 <tbody><tr>
 <td><strong>Stop loss-niveau</strong></td>
 <td>141,80447</td>
 <td class="align-left"><strong>Type</strong></td>
 <td>Turbo's</td>
 </tr>
 <tr>
 <td><strong>Financieringsniveau</strong></td>
 <td>135,05188</td>
 '''

soup = BeautifulSoup(html, 'lxml')
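# nth-of-type counts siblings within one row, so pick the second row, then its second cell.
print(soup.select_one('table[data-stream-id="723288"] tr:nth-of-type(2) td:nth-of-type(2)').text)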
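
If you would rather stay with re as in the question, note that re.search returns a match object rather than the text itself; the value has to be captured in a group and read back with .group(). Below is a minimal sketch of that idea. It assumes the table has been turned into a string (for example with str(s1[0])) and that the value cell directly follows the label cell. The name table_html and the pattern are illustrative and not from the original post.

```python
import re

# Assumed input: the table markup as a string, e.g. str(s1[0]) in the question.
table_html = """<tr>
 <td><strong>Financieringsniveau</strong></td>
 <td>135,05188</td>
</tr>"""

# Capture the contents of the first <td> that follows the label cell.
pattern = re.compile(
    r"Financieringsniveau</strong></td>\s*<td>([^<]+)</td>"
)

match = pattern.search(table_html)
# group(1) is the captured cell text; None if the label was not found.
fin_niveau = match.group(1).strip() if match else None
print(fin_niveau)
```

A regular expression like this depends on the exact markup, so the parser-based approaches are usually the more robust choice.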

You can use .findNext()

Ex:

from bs4 import BeautifulSoup

s = """<table class="not-clickable zebra" data-price-format="{price}" data-quote-detail="0" data-stream-id="723288" data-stream-quote-option="Standard">
 <tbody><tr>
 <td><strong>Stop loss-niveau</strong></td>
 <td>141,80447</td>
 <td class="align-left"><strong>Type</strong></td>
 <td>Turbo's</td>
 </tr>
 <tr>
 <td><strong>Financieringsniveau</strong></td>
 <td>135,05188</td></tr></tbody></table>"""

soup = BeautifulSoup(s, "html.parser")
print(soup.find(text="Financieringsniveau").findNext("td").text)  # Search by text, then use findNext

Output:

135,05188
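
The extracted text uses a decimal comma, so float() will not accept it directly. If an actual number is needed, something along these lines may help. This is a sketch that assumes any "." in the text is a thousands separator and "," is the decimal mark; the helper name parse_decimal_comma is hypothetical.

```python
def parse_decimal_comma(text: str) -> float:
    """Convert a string such as '135,05188' to a float."""
    # Assumption: '.' is only ever a thousands separator, ',' the decimal mark.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


value = parse_decimal_comma("135,05188")
print(value)
```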
