<p>Pyparsing's makeHTMLTags expression gives you results similar to a regex, but with automatic results names (like named groups) and tolerance for many of HTML's quirks:</p>
<pre><code>>>> from pyparsing import *
>>>
>>> h = """<a href="rtsp://v8.cache2.c.youtube.com/CjgLENy73wIaLwnqnxbpjjoGIRMYE
SARFEIJbXYtZ29vZ2xlSARSB3Jlc3VsdHNgpq6joefRgbhNDA==/0/0/0/video.3gp"><img src="h
ttp://i.ytimg.com/vi/IQY6jukWn-o/default.jpg?w=80&amp;h=60&amp;sigh=izeIwhz4POtP
OOr-jRGrtC4qiFA" alt="video" width="80" height="60" style="border:0;margin:0px;"
/></a>"""
>>>
>>> aTag = makeHTMLTags("A")[0]
>>> result = aTag.parseString(h)
>>> print(result.dump())
['A', ['href', 'rtsp://v8.cache2.c.youtube.com/CjgLENy73wIaLwnqnxbpjjoGIRMYESARFEIJbXYtZ29vZ2xlSARSB3Jlc3VsdHNgpq6joefRgbhNDA==/0/0/0/video.3gp'], False]
- empty: False
- href: rtsp://v8.cache2.c.youtube.com/CjgLENy73wIaLwnqnxbpjjoGIRMYESARFEIJbXYtZ29vZ2xlSARSB3Jlc3VsdHNgpq6joefRgbhNDA==/0/0/0/video.3gp
- startA: ['A', ['href', 'rtsp://v8.cache2.c.youtube.com/CjgLENy73wIaLwnqnxbpjjoGIRMYESARFEIJbXYtZ29vZ2xlSARSB3Jlc3VsdHNgpq6joefRgbhNDA==/0/0/0/video.3gp'], False]
  - empty: False
  - href: rtsp://v8.cache2.c.youtube.com/CjgLENy73wIaLwnqnxbpjjoGIRMYESARFEIJbXYtZ29vZ2xlSARSB3Jlc3VsdHNgpq6joefRgbhNDA==/0/0/0/video.3gp
>>> print(result.href)
rtsp://v8.cache2.c.youtube.com/CjgLENy73wIaLwnqnxbpjjoGIRMYESARFEIJbXYtZ29vZ2xlSARSB3Jlc3VsdHNgpq6joefRgbhNDA==/0/0/0/video.3gp
</code></pre>
<p>If you have many anchor tags and only want the ones whose href ends with ".3gp", do the following:</p>
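<p>The original snippet for this step did not survive, so the following is only a minimal sketch of one way to do the filtering, not necessarily the code the answer originally showed. It assumes a parse action that raises ParseException for any href not ending in ".3gp", which makes searchString skip that tag. The sample HTML, the example.com URLs, and the helper name must_end_with_3gp are hypothetical.</p>

```python
from pyparsing import makeHTMLTags, ParseException

# Hypothetical sample input: two anchors, only the first links to a .3gp file.
html = '''
<a href="rtsp://example.com/0/0/0/video.3gp"><img src="a.jpg" alt="video"/></a>
<a href="http://example.com/watch?v=abc">watch</a>
'''

# makeHTMLTags returns (opening-tag, closing-tag) expressions; keep the opener.
aTag = makeHTMLTags("A")[0]

def must_end_with_3gp(s, loc, tokens):
    # Raising ParseException inside a parse action rejects this match,
    # so searchString() skips the tag and keeps scanning.
    if not tokens.href.endswith(".3gp"):
        raise ParseException(s, loc, "href does not end with .3gp")

# addParseAction (rather than setParseAction) keeps any parse actions
# that makeHTMLTags may already have attached to the expression.
aTag.addParseAction(must_end_with_3gp)

hrefs = [match.href for match in aTag.searchString(html)]
print(hrefs)
```

<p>Because the check lives on the tag expression itself, the same aTag can be reused with scanString or inside a larger grammar and will still match only ".3gp" links.</p>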