<pre><code>from bs4 import BeautifulSoup
htmltxt = """<div class="what-im-after">
<p>
"content I want"
</p>
<p>
"content I want"
</p>
<p>
"content I want"
</p>
<div class="not-what-im-after">
<p>
"content I don't want"
</p>
</div>
<p>
"content I want"
</p><p>
"content I want"
</p>
</div>"""
soup = BeautifulSoup(htmltxt, 'lxml')
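def filter_p(container):
    # .contents holds only the direct children of the container
    items = container.contents
    ans = []
    for item in items:
        # text nodes between tags are skipped because their name is not 'p'
        if item.name == 'p':
            ans.append(item)
    return ans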
print(filter_p(soup.div))
</code></pre>
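<p>Maybe this is what you want.
It keeps only the first-level <code>p</code> children of the div, so the paragraph inside the nested div is left out.</p>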
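<p>As a possible alternative, the same first-level filtering can be sketched with Beautiful Soup's own <code>find_all(..., recursive=False)</code>, which only looks at direct children. The sample markup, the variable names, and the choice of <code>html.parser</code> (used here so that lxml is not required) are assumptions made for illustration.</p>

```python
from bs4 import BeautifulSoup

# Shortened sample markup (an assumption, modelled on the snippet above)
html = """<div class="what-im-after">
<p>first wanted</p>
<div class="not-what-im-after"><p>not wanted</p></div>
<p>second wanted</p>
</div>"""

# html.parser ships with Python, so no extra parser needs to be installed
soup = BeautifulSoup(html, "html.parser")

# Locate the outer div by its class
outer = soup.find("div", class_="what-im-after")

# recursive=False restricts the search to direct children of the outer div
direct_paragraphs = outer.find_all("p", recursive=False)

texts = [p.get_text(strip=True) for p in direct_paragraphs]
print(texts)
```

<p>This should return the same tags as <code>filter_p</code> for this kind of markup; which one to use is mostly a matter of taste.</p>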