Creating a list Lexer/par

User

Reply #1 · edited 2024-06-23 19:34:21

Try this:

keyWordList = ['command1', 'command2', 'command3']
userInput = 'The quick brown command1 fox jumped over command2 the lazy dog command3'
inputList = userInput.split()

def tokenize(userInputList, keyWordList):
    keywords = set(keyWordList)
    tokens, acc = [], []
    for e in userInputList:
        if e in keywords:
            tokens.append(acc)
            tokens.append(e)
            acc = []
        else:
            acc.append(e)
    if acc:
        tokens.append(acc)
    return tokens

tokenize(inputList, keyWordList)
> [['The', 'quick', 'brown'], 'command1', ['fox', 'jumped', 'over'], 'command2', ['the', 'lazy', 'dog'], 'command3']

User

Reply #2 · edited 2024-06-23 19:34:21

Like this:

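def tokenize(lst, keywords):
    cur = []
    for x in lst:
        if x in keywords:
            # emit the words collected so far, then the keyword itself
            yield cur
            yield x
            cur = []
        else:
            cur.append(x)
    # emit any words left after the last keyword
    if cur:
        yield cur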

This returns a generator, so wrap the call in list() if you need a list.
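A minimal sketch of that call, using a shortened version of the sample input from reply #1. The generator is repeated here with a trailing `if cur` flush so that words after the last keyword are kept; treat that flush, and the names `keywords` and `words`, as assumptions made for illustration.

```python
def tokenize(lst, keywords):
    """Yield runs of plain words as lists, and each keyword as a bare string."""
    cur = []
    for x in lst:
        if x in keywords:
            yield cur
            yield x
            cur = []
        else:
            cur.append(x)
    if cur:  # assumption: also emit words that follow the last keyword
        yield cur

keywords = {'command1', 'command2'}
words = 'The quick brown command1 fox jumped over command2 the lazy dog'.split()

# Calling tokenize() alone only creates the generator; list() drains it.
tokens = list(tokenize(words, keywords))
print(tokens)
```

Passing the keywords as a set keeps each `x in keywords` check constant-time, which is the same idea as the `set(keyWordList)` conversion in reply #1.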

User

Reply #3 · edited 2024-06-23 19:34:21

This is easy to do with a regular expression:

>>> import re
>>> keyWordList = ['command1', 'command2', 'command3']
>>> reg = r'(.+?)\s(%s)(?:\s|$)' % '|'.join(keyWordList)
>>> userInput = 'The quick brown command1 fox jumped over command2 the lazy dog command3'
>>> re.findall(reg, userInput)
[('The quick brown', 'command1'), ('fox jumped over', 'command2'), ('the lazy dog', 'command3')]

Now you only need to split the first element of each tuple.
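One way that last step might look, as a sketch rather than the poster's own code. It rebuilds the same pattern, then splits each captured phrase into words so the result has the same shape as the output in reply #1. The `re.escape` call and the `tokens` name are additions: escaping is not needed for these three keywords, but it is assumed to be wanted if a keyword ever contains regex metacharacters.

```python
import re

keyWordList = ['command1', 'command2', 'command3']
userInput = 'The quick brown command1 fox jumped over command2 the lazy dog command3'

# Same pattern as above; re.escape guards keywords containing regex metacharacters.
reg = r'(.+?)\s(%s)(?:\s|$)' % '|'.join(map(re.escape, keyWordList))

tokens = []
for words, keyword in re.findall(reg, userInput):
    tokens.append(words.split())  # first tuple element -> list of words
    tokens.append(keyword)        # second tuple element -> the keyword itself

print(tokens)
```

Note that this pattern only captures text that is followed by a keyword, so any words after the final keyword would not appear in the result.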

For more than one level of nesting, a regex is probably not a good answer.

This page lists some good parsers to choose from: http://wiki.python.org/moin/LanguageParsing

I think Lepl is a good choice.
