<p>You can grab the whole <code>h1</code> tag and then remove any links from it, like this:</p>
<pre><code>from bs4 import BeautifulSoup
html = """<h1 class="titleClass" itemprop="name">
Text title here
<a class="titleLink" href="somelink-here.html">
text link here
</a>
</h1>"""
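soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning
p = soup.find('h1', attrs={'class': 'titleClass'})
p.a.extract()  # remove the nested link from the tree (this modifies soup)
print(p.text.strip())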
</code></pre>
<p>This will display:</p>
<pre><code>Text title here
</code></pre>
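<p>Note that <code>extract()</code> removes the link from the parsed tree. If you still need the link afterwards, one possible alternative is to read only the strings that are direct children of the <code>h1</code>. This is a minimal sketch, not the only way to do it; the helper name <code>own_text</code> is made up for illustration and is not part of BeautifulSoup:</p>

```python
from bs4 import BeautifulSoup, NavigableString

html = """<h1 class="titleClass" itemprop="name">
Text title here
<a class="titleLink" href="somelink-here.html">
text link here
</a>
</h1>"""


def own_text(tag):
    """Join only the strings that are direct children of `tag` (hypothetical helper)."""
    return "".join(
        child for child in tag.children if isinstance(child, NavigableString)
    ).strip()


soup = BeautifulSoup(html, "html.parser")
h1 = soup.find("h1", attrs={"class": "titleClass"})

title = own_text(h1)  # text of the h1 itself, skipping the nested link
print(title)

# The link is still in the tree, so its attributes remain available.
print(h1.a["href"])
```

<p>Here the nested link is skipped rather than deleted, so <code>h1.a</code> can still be used after the title has been read.</p>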