<p>迭代html文件内容并打印匹配的行。这里我用列表l替换了文件内容:</p>
<pre><code>>>> l = ['<span class="new"> <a class="blog" href="http://whatever1.com" rel="nofollow">whatever1</a> do something at <a class="others" href="http://example1.com" rel="nofollow">example1</a></span>',
'<span class="new"> <a class="blog" href="http://whatever2.com" rel="nofollow">whatever2</a> do other things at <a class="others" href="http://example2.com" rel="nofollow">example2</a></span>',
'<span class="new"> <a class="blog" href="http://whatever3.com" rel="nofollow">whatever3</a> do something at <a class="others" href="http://example3.com" rel="nofollow">example3</a></span>' ]
>>> for i in range(len(l)):
if re.search('<span class="new">.*do something.*', l[i]):
print l[i]
<span class="new"> <a class="blog" href="http://whatever1.com" rel="nofollow">whatever1</a> do something at <a class="others" href="http://example1.com" rel="nofollow">example1</a></span>
<span class="new"> <a class="blog" href="http://whatever3.com" rel="nofollow">whatever3</a> do something at <a class="others" href="http://example3.com" rel="nofollow">example3</a></span>
>>>
</code></pre>