Retrieving information with BeautifulSoup

Posted 2024-06-01 09:24:49


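I'm new to BeautifulSoup. I want to retrieve a specific element inside a tag, but the problem is that the tag has nothing to identify it by.

Here is the HTML: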

<div class="tbl_racing_head" >
  <table class="tblgrey">
    <thead>
      <tr>
        <th width="65%" class="aln_left"><a name="1"></a>Race 1. THE ZILLAH CUP.<span class="pull-right"><a href="/index.php/en/racing/results?view=full#top" id="back-top">Back to Top</a></span></th>
        <th width="10%">1365 m</th>
        <th width="15%">Rating 25-0</th>
        <th width="10%">12h45</th>
      </tr>
    </thead>
  </table>
</div>

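I want to retrieve the th whose value is 1365, but I can't find a way to get it. I guess I have to use nextSibling or some parent-based method, but I'm stuck. Here is the code I tried: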

url = 'http://www.mauritiusturfclub.com/index.php/en/racing/results?meeting='+str(race_played)+'-'+str(2012)+'&view=full'
source_code = requests.get(url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text,'html.parser')
print('Track '+soup.findAll('th',{'width':'10%'})[3])

I get an error and it doesn't seem to work. Can someone explain what is going on? Thanks.

<div class="tbl_racing_head" >
  <table class="tblgrey">
    <thead>
      <tr>
        <th width="65%" class="aln_left"><a name="1"></a>Race 1. THE ZILLAH CUP.<span class="pull-right"><a href="/index.php/en/racing/results?view=full#top" id="back-top">Back to Top</a></span></th>
        <th width="10%">1365 m</th>
        <th width="15%">Rating 25-0</th>
        <th width="10%">12h45</th>
      </tr>
    </thead>
  </table>
</div>
<table class="tblgrey">
  <thead>
    <tr> <th class="txt_center">Rank</th> <th class="txt_center">#</th> <th class="txt_center">Horse</th> <th class="txt_center">Stable</th> <th class="txt_center">Jockey</th> <th class="txt_center">Time</th> <th class="txt_center">Prize</th> </tr>
  </thead>
  <tbody>
    <tr> <td class="txt_center">1</td> <td class="txt_center">9</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >POLE OF COLD</a></td> <td class="txt_left">GUJADHUR</td> <td class="txt_left">V.Sola</td> <td class="txt_center">1m23.80</td> <td class="txt_center">115000</td> </tr>
    <tr> <td class="txt_center">2</td> <td class="txt_center">8</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >ROMAN SPLENDOUR</a></td> <td class="txt_left">R.GUJADHUR</td> <td class="txt_left">J.Bardottier</td> <td class="txt_center">1m23.94</td> <td class="txt_center">38000</td> </tr>
    <tr> <td class="txt_center">3</td> <td class="txt_center">6</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >ADDITION</a></td> <td class="txt_left">MAIGROT</td> <td class="txt_left">R.Hoolash</td> <td class="txt_center">1m24.18</td> <td class="txt_center">20000</td> </tr>
    <tr> <td class="txt_center">4</td> <td class="txt_center">5</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >TANGERINE</a></td> <td class="txt_left">S.RAMDIN</td> <td class="txt_left">N.Marday</td> <td class="txt_center">1m24.68</td> <td class="txt_center">14000</td> </tr>
    <tr> <td class="txt_center">5</td> <td class="txt_center">3</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >JUST OPPOSITE</a></td> <td class="txt_left">ALLET</td> <td class="txt_left">S.Bhundoo</td> <td class="txt_center">1m24.82</td> <td class="txt_center">8000</td> </tr>
    <tr> <td class="txt_center">6</td> <td class="txt_center">10</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >PORT ALBERT</a></td> <td class="txt_left">C.RAMDIN</td> <td class="txt_left">S.Bussunt</td> <td class="txt_center">1m24.87</td> <td class="txt_center">0</td> </tr>
    <tr> <td class="txt_center">7</td> <td class="txt_center">4</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >PACMAN</a></td> <td class="txt_left">S.HENRY</td> <td class="txt_left">B.Bhaugeerothee</td> <td class="txt_center">1m25.01</td> <td class="txt_center">0</td> </tr>
    <tr> <td class="txt_center">8</td> <td class="txt_center">2</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >JUST MODERN</a></td> <td class="txt_left">G.ROUSSET</td> <td class="txt_left">N.Teeha</td> <td class="txt_center">1m25.38</td> <td class="txt_center">0</td> </tr>
    <tr> <td class="txt_center">9</td> <td class="txt_center">1</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >DREAMS COME TRUE</a></td> <td class="txt_left">R.MAINGARD</td> <td class="txt_left">K.Ghunowa</td> <td class="txt_center">1m25.52</td> <td class="txt_center">0</td> </tr>
    <tr> <td class="txt_center">-</td> <td class="txt_center">7</td> <td class="txt_left"><a href="/index.php/en/component/mtc_horse_rating_list/?view=horse" >CARAMEL KING</a></td> <td class="txt_left">P.MERVEN</td> <td class="txt_left">S.Rama</td> <td class="txt_center">-</td> <td class="txt_center">0</td> </tr>
  </tbody>
</table>

1 answer
User
#1 · Posted 2024-06-01 09:24:49

Try this:

s = """<div class="tbl_racing_head" >
        <table class="tblgrey">
            <thead>
            <tr>
              <th width="65%" class="aln_left"><a name="1"></a>Race 1. THE ZILLAH CUP.<span class="pull-right"><a href="/index.php/en/racing/results?view=full#top" id="back-top">Back to Top</a></span></th>
              <th width="10%">1365 m</th>
              <th width="15%">Rating 25-0</th>
              <th width="10%">12h45</th>
            </tr>
            </thead>
        </table>
        </div>"""
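from bs4 import BeautifulSoup
soup = BeautifulSoup(s, 'html.parser')
# Each header row has two <th width="10%"> cells; find() returns the first one (the distance).
for tr in soup.find_all("tr"):
    th = tr.find("th", {'width': '10%'})
    if th is not None:  # rows without such a cell (e.g. result rows) are skipped
        print(th.text)  # prints "1365 m" for the snippet above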
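
As for why the original attempt fails: most likely it is because `'Track ' + ...` tries to concatenate a string with a bs4 `Tag` object, which raises a `TypeError`. Indexing with `[3]` can also raise an `IndexError` when fewer than four matching cells exist. Below is a minimal sketch of one way to get the distance as a number. It assumes the distance is always the first `<th width="10%">` in a header row and always looks like "1365 m". The function name `race_distances` and the trimmed sample markup are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Trimmed copy of the race header from the question (illustrative sample).
html = """<div class="tbl_racing_head">
<table class="tblgrey"><thead><tr>
<th width="65%" class="aln_left">Race 1. THE ZILLAH CUP.</th>
<th width="10%">1365 m</th>
<th width="15%">Rating 25-0</th>
<th width="10%">12h45</th>
</tr></thead></table></div>"""


def race_distances(markup):
    """Return the distance in metres found in each header row (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    distances = []
    for tr in soup.find_all("tr"):
        # First <th width="10%"> in the row; None if the row has no such cell.
        th = tr.find("th", width="10%")
        if th is None:
            continue
        # get_text() returns a plain str, so it is safe to match or concatenate.
        match = re.fullmatch(r"\s*(\d+)\s*m\s*", th.get_text())
        if match:
            distances.append(int(match.group(1)))
    return distances


for metres in race_distances(html):
    print("Track " + str(metres))
```

Converting to `str` (or calling `get_text()`) before concatenating is what avoids the error. The regular expression is only there to drop the trailing " m".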
