<p>You can also do this with XSLT and a custom XPath function.</p>
<p>Here is an example. It still needs some extra work, for instance to handle whitespace in the text properly.</p>
<p>Given this input:</p>
<pre><code>
<html>
<head>
</head>
<body>
<p>here is some text to bold</p>
<p>and some more</p>
</body>
</html>
</code></pre>
<p>and a glossary containing two words: <b>some, bold</b></p>
<p>the example output is:</p>
^{pr2}$
<p>Here is the code. I have also posted it at <a href="http://bkc.pastebin.com/f545a8e1d" rel="nofollow noreferrer">http://bkc.pastebin.com/f545a8e1d</a></p>
<pre>
<code>
from lxml import etree

# Copies the document as-is, except that every text node is passed
# through the btest:bolder() extension function defined below.
stylesheet = etree.XML("""
    <xsl:stylesheet version="1.0"
         xmlns:btest="uri:bolder"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <xsl:template match="@*">
            <xsl:copy />
        </xsl:template>

        <xsl:template match="*">
            <xsl:element name="{name(.)}">
                <xsl:copy-of select="@*" />
                <xsl:apply-templates select="text()" />
                <xsl:apply-templates select="./*" />
            </xsl:element>
        </xsl:template>

        <xsl:template match="text()">
            <xsl:copy-of select="btest:bolder(.)/node()" />
        </xsl:template>
    </xsl:stylesheet>
""")

glossary = ['some', 'bold']


def bolder(context, s):
    """Return a list of <r> wrapper elements whose children are plain
    text runs and <b> elements for glossary words."""
    results = []
    r = None  # wrapper currently collecting non-glossary words
    for word in s[0].split():
        if word in glossary:
            # flush any pending plain-text run first
            if r is not None:
                results.append(r)
            r = etree.Element('r')
            b = etree.SubElement(r, 'b')
            b.text = word
            b.tail = ' '
            results.append(r)
            r = None
        else:
            if r is None:
                r = etree.Element('r')
            r.text = '%s%s ' % (r.text or '', word)
    if r is not None:
        results.append(r)
    return results


def test():
    ns = etree.FunctionNamespace('uri:bolder')  # register global namespace
    ns['bolder'] = bolder  # define function in new global namespace
    transform = etree.XSLT(stylesheet)
    doc = etree.XML("""<html><head></head><body><p>here is some text to bold</p><p>and some more</p></body></html>""")
    print(str(transform(doc)))


if __name__ == "__main__":
    test()
</code>
</pre>
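<p>As for the whitespace caveat mentioned at the top: <code>bolder</code> uses <code>str.split()</code>, which collapses every run of whitespace to a single space and appends a trailing one. The following is only a sketch of one possible way around that, not part of the original answer. The helper name <code>split_preserving_whitespace</code> and the <code>GLOSSARY</code> set are hypothetical; it relies on <code>re.split</code> with a capturing group so the whitespace runs are kept as tokens.</p>

```python
import re

# Hypothetical stand-in for the glossary list used in the answer's code.
GLOSSARY = {'some', 'bold'}


def split_preserving_whitespace(text, glossary=GLOSSARY):
    """Return (chunk, is_glossary_word) pairs; joining the chunks
    reproduces `text` exactly, whitespace included."""
    pieces = []
    # The capturing group makes re.split keep the whitespace runs.
    for token in re.split(r'(\s+)', text):
        if not token:
            continue  # re.split yields '' at the edges of the string
        is_word = token in glossary
        if pieces and not is_word and not pieces[-1][1]:
            # Merge with the preceding non-glossary chunk.
            pieces[-1] = (pieces[-1][0] + token, False)
        else:
            pieces.append((token, is_word))
    return pieces


if __name__ == "__main__":
    for chunk, is_word in split_preserving_whitespace("here is  some text\nto bold"):
        print(repr(chunk), is_word)
```

<p>Inside <code>bolder</code>, each pair flagged <code>True</code> would become a <code>b</code> element and each other chunk would be appended as text unchanged. Note that a word followed by punctuation, such as <code>bold,</code>, still would not match the glossary with this approach.</p>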