我需要创建一个标记器,它将用逗号分割字符串
使用split可以做到这一点
re.split(',+', str)
但是我需要使用compile。 我试过了
text = "5g, dynamic vision sensor (dvs), 3-d reconstruction, neuromorphic engineering, neural networks, humanoid robots, neuromorphics, closed loop systems, field programmable gate arrays, spiking motor controller, neuromorphic implementation, icub, relation neural network"
pattern = re.compile(r'[a-z0-9\(\)-]+')
re.findall(pattern, text)
输出是
['5g', 'dynamic', 'vision', 'sensor', '(dvs)', '3-d', 'reconstruction', 'neuromorphic', 'engineering', 'neural', 'networks', 'humanoid', 'robots', 'neuromorphics', 'closed', 'loop', 'systems', 'field', 'programmable', 'gate', 'arrays', 'spiking', 'motor', 'controller', 'neuromorphic', 'implementation', 'icub', 'relation', 'neural', 'network']
期望输出为
['5g', 'dynamic vision sensor (dvs)', '3-d reconstruction', 'neuromorphic engineering', 'neural networks', 'humanoid robots', 'neuromorphics', 'closed loop systems', 'field programmable gate arrays', 'spiking motor controller', 'neuromorphic implementation', 'icub', 'relation neural network']
不要为此使用正则表达式。只需使用python的内置函数
split()
尝试以下模式:
[a-z0-9() -]+(?=,|$)
代码:
输出:
正如@mama所说,您不需要为此使用regex,但如果您特别想使用re.compile,可以使用以下代码:
相关问题 更多 >
编程相关推荐