Extracting substrings from a string with regex in Python

User

#1 · edited 2024-07-05 14:29:25

Here is a version that uses a regular expression but does not need to loop over all the parts twice:

import re

def extract(line):
    # Keep only the text that follows ' Parse: '
    _, _, parts = line.strip().partition(' Parse: ')
    # Split on ' +  ', optionally preceded by ' |'
    return re.split(r'(?: \|)? \+  ', parts)

line = "Input:Can we book an hotel in Lagos ? Parse: book VB ROOT +  Can MD aux +  we PRP nsubj +  hotel NN dobj | +  an DT det | +  in IN prep | +  Lagos NNP pobj +  ? . punct "
print(extract(line))
# ['book VB ROOT', 'Can MD aux', 'we PRP nsubj', 'hotel NN dobj', 'an DT det', 'in IN prep', 'Lagos NNP pobj', '? . punct']

User

#2 · edited 2024-07-05 14:29:25

I would use re.split:

>>> import re
>>> s = 'Can we book an hotel in Lagos ? Parse: book VB ROOT  +  Can MD aux  +  we PRP nsubj  +  hotel NN dobj  |   +  an DT det  |   +  in IN prep  |       +  Lagos NNP pobj  +  ? . punct'
>>> re.split(r'\s*\|?\s*\+\s* \s*', s.split('Parse:')[1].strip())
['book VB ROOT', 'Can MD aux', 'we PRP nsubj', 'hotel NN dobj', 'an DT det', 'in IN prep', 'Lagos NNP pobj', '? . punct']
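
The one-line pattern is dense, so here is a minimal sketch that spells it out with re.VERBOSE. The names SEPARATOR and split_parse are made up for illustration, and the final `\s+` is a simplification of the original `\s* \s*` (it also accepts a lone tab), so treat it as close to the pattern above rather than identical.

```python
import re

# Hypothetical name; an annotated approximation of the pattern above.
SEPARATOR = re.compile(r"""
    \s*     # whitespace before the separator
    \|?     # an optional '|' left over from the tree drawing
    \s*     # whitespace between the '|' and the '+'
    \+      # the '+' that starts each new entry
    \s+     # at least one whitespace character after the '+'
""", re.VERBOSE)

def split_parse(text):
    """Return the entries that follow 'Parse:' in text, or [] if it is absent."""
    _, found, parse = text.partition('Parse:')
    if not found:
        return []
    return SEPARATOR.split(parse.strip())

sample = 'Can we book ? Parse: book VB ROOT  +  Can MD aux  |   +  we PRP nsubj'
print(split_parse(sample))
# ['book VB ROOT', 'Can MD aux', 'we PRP nsubj']
```

Compiling the pattern once is only a convenience when many lines are processed; for a single call, re.split with the raw string works just as well.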

User

#3 · edited 2024-07-05 14:29:25

Using only built-in functions and methods, without regex:

>>> list(filter(bool, map(str.strip, s.replace('+ ', '|').split('Parse:')[1].split('|'))))
['book VB ROOT', 'Can MD aux', 'we PRP nsubj', 'hotel NN dobj', 'an DT det', 'in IN prep', 'Lagos NNP pobj', '? . punct']
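
For readers who find the nested filter/map call hard to follow, the same idea can be sketched as a small function with a list comprehension. The name extract_without_regex is hypothetical, and the sketch assumes that no token itself contains '|' or '+ ', since both are treated as separators.

```python
def extract_without_regex(text):
    """Split the part after 'Parse:' into entries using only str methods."""
    # Everything after the first 'Parse:' (empty string if it is missing)
    _, _, parse = text.partition('Parse:')
    # Turn every '+ ' into '|' so that a single split handles both separators
    pieces = parse.replace('+ ', '|').split('|')
    # Strip surrounding whitespace and drop the empty pieces left between '|' and '+'
    return [piece.strip() for piece in pieces if piece.strip()]

sample = 'Can we book ? Parse: book VB ROOT  +  Can MD aux  |   +  we PRP nsubj'
print(extract_without_regex(sample))
# ['book VB ROOT', 'Can MD aux', 'we PRP nsubj']
```

Because it returns a list directly, this version behaves the same under Python 2 and Python 3.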
