<p>You can first <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/#decompose" rel="nofollow noreferrer"><code>decompose()</code></a> the "br" tags, then use the <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors" rel="nofollow noreferrer"><code>select()</code></a> method to retrieve the <code>i</code> tags, and finally use <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-sibling-and-previous-sibling" rel="nofollow noreferrer"><code>.next_sibling</code></a> to get the text that follows each tag.</p>
<pre><code>In [81]: from bs4 import BeautifulSoup as BS
In [82]: html = """<div class="rbody">
...: <div style="color:#ff6666"> </div>
...: <i>objectid: </i> 137000<br/>
...: <i>topoid: </i> 504514394<br/>
...: <i>poigroup: </i> Hydrography<br/>
...: <i>poitype: </i> Manmade Waterbody<br/>
...: <i>poiname: </i> FOUR CORNERS DAM<br/>
...: <i>poilabel: </i> FOUR CORNERS DAM<br/>
...: <i>poilabeltype: </i> NAMED<br/>
...: <i>poialtlabel: </i> <br/>
...: <i>Point:</i><br/>
...: <i>X: </i> 1.5778346701624997E7 <br/>
...: <i>Y: </i> -3861557.6243750006 <br/>
...: <br/><br/>
...: </div>"""
In [83]: soup = BS(html, "html.parser")
In [84]: for br in soup.select(".rbody > br"):
...:     br.decompose()
...:
In [85]: {i.get_text(strip=True).replace(":", ""): i.next_sibling.strip() for i in soup.select(".rbody > i")}
Out[85]:
{'Point': '',
'X': '1.5778346701624997E7',
'Y': '-3861557.6243750006',
'objectid': '137000',
'poialtlabel': '',
'poigroup': 'Hydrography',
'poilabel': 'FOUR CORNERS DAM',
'poilabeltype': 'NAMED',
'poiname': 'FOUR CORNERS DAM',
'poitype': 'Manmade Waterbody',
'topoid': '504514394'}
</code></pre>
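<p>The same idea can also be sketched as a plain script that leaves the tree unmodified. This is only one possible variation, not part of the session above: instead of decomposing the <code>br</code> tags, it checks whether the sibling after each <code>i</code> tag is a string. The function name <code>parse_fields</code> and the shortened <code>sample</code> markup are made up for illustration.</p>

```python
from bs4 import BeautifulSoup, NavigableString


def parse_fields(html):
    """Map each <i> label directly under .rbody to the text that follows it.

    parse_fields is a hypothetical helper name used only for this sketch.
    """
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    for label in soup.select(".rbody > i"):
        # "objectid: " -> "objectid"
        key = label.get_text(strip=True).rstrip(":")
        sibling = label.next_sibling
        # When a label is followed directly by <br/>, the sibling is a Tag
        # (or None at the end of the parent), so fall back to an empty string.
        value = sibling.strip() if isinstance(sibling, NavigableString) else ""
        fields[key] = value
    return fields


# Shortened version of the markup from the session above.
sample = """<div class="rbody">
<i>objectid: </i> 137000<br/>
<i>poialtlabel: </i> <br/>
<i>Point:</i><br/>
<i>X: </i> 1.5778346701624997E7 <br/>
</div>"""

print(parse_fields(sample))
```

<p>Because the <code>br</code> tags stay in place, the <code>soup</code> object can still be used for other queries afterwards. The values remain strings, so converting <code>X</code> and <code>Y</code> to <code>float</code> is left to the caller.</p>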