<p>I know you're not looking for a general-purpose parser, but <a href="http://pyparsing.wikispaces.com/" rel="nofollow">pyparsing</a> makes this very simple. Your format looks a lot like the <a href="http://pyparsing.wikispaces.com/file/view/chemicalFormulas.py/31041705/chemicalFormulas.py" rel="nofollow">chemical formula parser</a> that I wrote as one of the earliest pyparsing examples.</p>
<p>Here is your problem implemented using pyparsing:</p>
<pre><code>from pyparsing import (Suppress,Word,alphas,nums,Combine,Optional,Regex,Group,
OneOrMore)
"""
-the string can consist of multiple, at least one, [a-zA-Z] chars
-the string is followed by zero or more digits: e bes g4 c16
-the string is followed by zero or more ' or , (not combined):
e' bes, f'''2 g,,4
-the string can be substituted by a list of strings, list limiters are <>;
the number comes behind the >, no space allowed
"""
LT,GT = map(Suppress,"<>")
integer = Word(nums).setParseAction(lambda t:int(t[0]))
note = Combine(Word(alphas) + Optional(Word(',') | Word("'")))
# or equivalent using Regex class
# note = Regex(r"[a-zA-Z]+('+|,+)?")
# define the list format of one or more notes within '<>'s
note_list = Group(LT + OneOrMore(note) + GT)
# each item is a note_list or a note, optionally followed by an integer; if
# no integer is given, default to 0
item = (note_list | Group(note)) + Optional(integer, default=0)
# reformat the parsed data as a (number, list_of_notes) tuple
item.setParseAction(lambda t: (t[1],t[0].asList()) )
source = "c'4 d8 < e' g' >16 fis'4 a,, <g, b'> c''1"
print(OneOrMore(item).parseString(source))
</code></pre>
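<p>This prints a list of <code>(number, notes)</code> tuples, one per item: for example <code>(4, ["c'"])</code> for <code>c'4</code>, and <code>(0, ['a,,'])</code> for <code>a,,</code>, which has no number after it.</p>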
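<p>The default of 0 only marks "no number given". If a missing number should mean "same as the previous item" (that is an assumption about your format, though it is how LilyPond treats omitted durations), a small post-processing pass can fill those in. Below is a minimal sketch that works on a plain list of <code>(number, notes)</code> tuples; <code>fill_durations</code> and its <code>initial</code> parameter are hypothetical names of mine, not part of pyparsing:</p>

```python
def fill_durations(items, initial=4):
    """Replace 0 durations with the most recent explicit duration.

    items   -- iterable of (duration, notes) tuples, 0 meaning "not given"
    initial -- assumed duration to use before any explicit one is seen
    """
    filled = []
    current = initial
    for duration, notes in items:
        if duration:  # an explicit duration updates the running value
            current = duration
        filled.append((current, notes))
    return filled


# Sample data in the same (duration, notes) shape the parser produces
parsed = [(4, ["c'"]), (0, ["a,,"]), (16, ["e'", "g'"]), (0, ["g,", "b'"])]
result = fill_durations(parsed)
print(result)
```

<p>With the parser above, you would pass in <code>OneOrMore(item).parseString(source).asList()</code> instead of the hand-written sample list.</p>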