Parsing a word escape-split over multiple lines with pyparsing

from pyparsing import *

continued_ending = Literal('\\') + lineEnd
word = Word(alphas)
split_word = word + Suppress(continued_ending)

multi_line_word = Forward()
multi_line_word << (word | (split_word + multi_line_word))

print(multi_line_word.parseString(
'''super\\
cali\\
fragi\\
listic'''))

2 Answers

User

Answer 1 · edited 2024-09-27 18:18:25

After poking around for a bit, I came upon this help thread, where there was this notable bit:

I often see inefficient grammars when someone implements a pyparsing grammar directly from a BNF definition. BNF does not have a concept of "one or more" or "zero or more" or "optional"...

This gave me the idea to change these two lines

multi_line_word = Forward()
multi_line_word << (word | (split_word + multi_line_word))

to

multi_line_word = ZeroOrMore(split_word) + word

This got it to output what I was looking for: ['super', 'cali', 'fragi', 'listic'].

Next, I added a parse action that joins these tokens together:

multi_line_word.setParseAction(lambda t: ''.join(t))

This gives a final output of ['supercalifragilistic'].
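
For reference, here is a minimal self-contained sketch of the iterative approach with the joining parse action. It assumes the input has no stray whitespace around the backslashes, and it uses ZeroOrMore as one iterative form that produces the outputs quoted above. The names text, pieces, joined_word and joined are only illustrative.

```python
from pyparsing import Literal, Suppress, Word, ZeroOrMore, alphas, lineEnd

# A backslash followed by the end of the line marks a continuation.
continued_ending = Literal('\\') + lineEnd
word = Word(alphas)
split_word = word + Suppress(continued_ending)

# Iteration instead of recursion: any number of continued pieces, then a final word.
multi_line_word = ZeroOrMore(split_word) + word

text = 'super\\\ncali\\\nfragi\\\nlistic'

# Without a parse action, each piece comes back as its own token.
pieces = multi_line_word.parseString(text).asList()

# Attach the join to a copy so the un-joined expression above stays unchanged.
joined_word = multi_line_word.copy().setParseAction(lambda t: ''.join(t))
joined = joined_word.parseString(text).asList()

print(pieces)
print(joined)
```

Attaching the parse action to a copy is only a convenience for showing both results side by side; calling setParseAction on multi_line_word directly works the same way.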

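The take-home message I learned is that one doesn't simply walk into Mordor.

Just kidding.

The real take-home message is that you can't simply implement a one-to-one translation of a BNF with pyparsing. Some tricks using the iterative types (OneOrMore, ZeroOrMore, Optional) should be called into use.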

Edit 2009-11-25: To handle more complicated test cases, I modified the code to the following:

no_space = NotAny(White(' \t\r'))
# make sure that the EOL immediately follows the escape backslash
continued_ending = Literal('\\') + no_space + lineEnd
word = Word(alphas)
# make sure that the escape backslash immediately follows the word
split_word = word + NotAny(White()) + Suppress(continued_ending)
multi_line_word = OneOrMore(split_word + NotAny(White())) + Optional(word)
multi_line_word.setParseAction(lambda t: ''.join(t))

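This has the benefit of making sure that no whitespace comes between any of the elements (except for the newline that follows each escaping backslash).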
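
A quick way to check that behavior is to feed the stricter grammar one clean input and two inputs with a stray space. The sketch below repeats the grammar so that it runs on its own; the helper try_parse and the sample strings are hypothetical additions for illustration.

```python
from pyparsing import (Literal, NotAny, OneOrMore, Optional, ParseException,
                       Suppress, White, Word, alphas, lineEnd)

# Same grammar as above, repeated so this snippet is self-contained.
no_space = NotAny(White(' \t\r'))
continued_ending = Literal('\\') + no_space + lineEnd
word = Word(alphas)
split_word = word + NotAny(White()) + Suppress(continued_ending)
multi_line_word = OneOrMore(split_word + NotAny(White())) + Optional(word)
multi_line_word.setParseAction(lambda t: ''.join(t))


def try_parse(text):
    """Return the parsed token list, or None if the grammar rejects the text."""
    try:
        return multi_line_word.parseString(text).asList()
    except ParseException:
        return None


# Backslash directly after each piece, newline directly after each backslash.
clean = try_parse('super\\\ncali\\\nfragi\\\nlistic')

# A space between the backslash and the newline is rejected.
space_after_backslash = try_parse('super\\ \ncali')

# A space between the word and the backslash is rejected as well.
space_before_backslash = try_parse('super \\\ncali')

print(clean, space_after_backslash, space_before_backslash)
```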

User

Answer 2 · edited 2024-09-27 18:18:25

You were pretty close with your code. Any of these modifications would work:

# '|' means MatchFirst, so you had a left-recursive expression
# reversing the order of the alternatives makes this work
multi_line_word << ((split_word + multi_line_word) | word)

# '^' means Or/MatchLongest, but beware using this inside a Forward
multi_line_word << (word ^ (split_word + multi_line_word))

# an unusual use of delimitedList, but it works
multi_line_word = delimitedList(word, continued_ending)

# in place of your parse action, you can wrap in a Combine
multi_line_word = Combine(delimitedList(word, continued_ending))
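
To see the difference the ordering makes, the following sketch runs the original alternative order, the reversed order, and the Combine variant against the same input. The variable names (original, reordered, combined and their results) are illustrative only, and the sample text assumes no extra whitespace around the backslashes.

```python
from pyparsing import (Combine, Forward, Literal, Suppress, Word, alphas,
                       delimitedList, lineEnd)

continued_ending = Literal('\\') + lineEnd
word = Word(alphas)
split_word = word + Suppress(continued_ending)

text = 'super\\\ncali\\\nfragi\\\nlistic'

# Original order: MatchFirst takes the plain word and never tries the second alternative.
original = Forward()
original << (word | (split_word + original))
original_result = original.parseString(text).asList()

# Reversed order: the continued form is tried first, with the plain word as fallback.
reordered = Forward()
reordered << ((split_word + reordered) | word)
reordered_result = reordered.parseString(text).asList()

# delimitedList treats the backslash-newline pair as the delimiter and suppresses it;
# Combine then joins the remaining pieces into a single token.
combined = Combine(delimitedList(word, continued_ending))
combined_result = combined.parseString(text).asList()

print(original_result)
print(reordered_result)
print(combined_result)
```

Note that Combine also requires the matched pieces to be adjacent by default, so it doubles as a check that nothing but the suppressed delimiter sits between them.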

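As you found while googling pyparsing, BNF->pyparsing translations should be done with a special view to using pyparsing features in place of BNF's, um, shortcomings. I was actually in the middle of composing a longer answer that went into more of the BNF translation issues, but you have already found this material (on the wiki, I assume).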
