<p>下面是一个使用pyparsing更符合习惯用法的另一个答案。因为它提供了一个详细的语法来说明可以看到什么输入和应该返回什么结果,所以解析的数据并不是“杂乱无章的”,因此<code>toXML()</code>不需要那么辛苦地工作,也不需要进行任何真正的清理。在</p>
<pre><code>print "\n----- ORIGINAL -----\n"
dump = """
abc = (
bcd = (efg = 0, ghr = 5, lmn 10),
ghd = 5,
zde = (dfs = 10, fge =20, dfg = (sdf = 3, ert = 5), juh = 0))
""".strip()
print dump
print "\n----- PARSED INTO LIST -----\n"
from pyparsing import Word, alphas, nums, Optional, Forward, delimitedList, Group, Suppress
def Syntax():
"""Define grammar and parser."""
# building blocks
name = Word(alphas)
number = Word(nums)
_equals = Optional(Suppress('='))
_lpar = Suppress('(')
_rpar = Suppress(')')
# larger constructs
expr = Forward()
value = number | Group( _lpar + delimitedList(expr) + _rpar )
expr << name + _equals + value
return expr
parsed = Syntax().parseString(dump)
print parsed
print "\n----- SERIALIZED INTO XML ----\n"
def toXML(part, level=0):
xml = ""
indent = " " * level
while part:
tag = part.pop(0)
payload = part.pop(0)
insides = payload if isinstance(payload, str) \
else "\n" + toXML(payload, level+1) + indent
xml += "{indent}<{tag}>{insides}</{tag}>\n".format(**locals())
return xml
print toXML(parsed)
</code></pre>
<p>输入和XML输出与我的另一个答案相同。<code>parseString()</code>返回的数据是唯一真正的更改:</p>
^{pr2}$