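<p>You can use <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/#decompose" rel="noreferrer"><code>decompose</code></a> to remove the unwanted tags, along with their contents, from the document entirely, and the <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/#strings-and-stripped-strings" rel="noreferrer"><code>stripped_strings</code></a> generator to retrieve the remaining text.</p>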
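<pre><code>from bs4 import BeautifulSoup

def clean_me(html):
    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(html, 'html.parser')
    # soup([...]) is shorthand for soup.find_all([...])
    for s in soup(['script', 'style']):
        s.decompose()  # remove the tag and everything inside it
    return ' '.join(soup.stripped_strings)
</code></pre>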
<hr/>
<pre><code>>>> clean_me(testhtml)
'THIS IS AN EXAMPLE I need this text captured And this'
</code></pre>
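<p>Since <code>testhtml</code> is not defined in this answer, here is a minimal self-contained sketch of the same approach that can be run as-is. The <code>SAMPLE_HTML</code> string is a made-up input for illustration only, and the choice of <code>'html.parser'</code> is an assumption; another installed parser should work as well.</p>

```python
from bs4 import BeautifulSoup

# Hypothetical sample input (not the original testhtml, which is not shown here).
SAMPLE_HTML = """
<html>
  <head>
    <title>Sample page</title>
    <style>p { color: red; }</style>
    <script>var hidden = "not wanted";</script>
  </head>
  <body>
    <h1>Visible heading</h1>
    <p>First paragraph.</p>
    <script>console.log("also not wanted");</script>
    <p>Second paragraph.</p>
  </body>
</html>
"""


def clean_me(html):
    # Assumption: the built-in 'html.parser' is good enough for this input.
    soup = BeautifulSoup(html, "html.parser")
    # Drop every <script> and <style> tag together with its contents.
    for s in soup(["script", "style"]):
        s.decompose()
    # Join the remaining text nodes, each stripped of surrounding whitespace.
    return " ".join(soup.stripped_strings)


result = clean_me(SAMPLE_HTML)
print(result)
```

<p>Note that <code>stripped_strings</code> only trims whitespace at the edges of each text node and skips nodes that are whitespace only; runs of whitespace inside a node are left alone.</p>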