Split a sentence into separate strings that each start with a capital letter

Posted 2024-09-27 21:32:57

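Basically, I want to split the following string into two separate strings, like this:

Input: "LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second. MOBERG struck out swinging (2-2 BSSFBS)."

Output: ["LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second.", "MOBERG struck out swinging (2-2 BSSFBS)."]

A new sentence always starts with a word in capital letters (i.e. the player's name). Here is the code I tried: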

import re

string = 'LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second. MOBERG struck out swinging (2-2 BSSFBS).'
x = re.findall("[A-Z].*?[\.!?]", string, re.DOTALL)
print(x)

My code currently prints the following; the first string in the list is wrong:

['LIPCIUS, A.', 'FBF); AMMONS advanced to second.', 'MOBERG struck out swinging (2-2 BSSFBS).']
It should be ['LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second.', 'MOBERG struck out swinging (2-2 BSSFBS).']

2 Answers

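The regex below should work for you. After the closing punctuation it allows an optional whitespace character and then adds a lookahead assertion that requires either a capital letter or the end of the string ($), so the match no longer stops at "A.".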

import re

string = 'LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second. MOBERG struck out swinging (2-2 BSSFBS).'
# Lazy match from a capital letter up to sentence punctuation that is followed by
# optional whitespace and then either a capital letter or the end of the string.
x = re.findall(r"[A-Z].*?[\.!?]\s?(?=[A-Z]|$)", string, re.DOTALL)
print(x)
# ['LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second. ', 'MOBERG struck out swinging (2-2 BSSFBS).']

import re

s = 'LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second. MOBERG struck out swinging (2-2 BSSFBS).'
# Split on ". " only where an all-uppercase word follows.
l = re.split(r'[.][ ](?=[A-Z]+\b)', s)
print(l)

It just drops the period at the end of each item except the last, but I guess that won't bother you.
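
If the missing period or the trailing space does matter, one possible variation is to split on the whitespace itself and move the punctuation into a lookbehind, so nothing but the space is consumed. This is only a sketch under the question's assumption that every sentence starts with an all-uppercase name; `split_plays` is a hypothetical helper name, not something from either answer. It would also split incorrectly if an initial such as "A." were directly followed by an all-uppercase word inside a sentence.

```python
import re


def split_plays(text):
    """Split play-by-play text into sentences that start with an all-caps name.

    Hypothetical helper for illustration; assumes each new sentence begins
    with an all-uppercase word, as stated in the question.
    """
    # Split on whitespace that follows '.', '!' or '?' and precedes an
    # all-uppercase word. The lookbehind and lookahead consume nothing,
    # so every sentence keeps its closing punctuation.
    return re.split(r'(?<=[.!?])\s+(?=[A-Z]+\b)', text)


s = 'LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second. MOBERG struck out swinging (2-2 BSSFBS).'
print(split_plays(s))
# ['LIPCIUS, A. grounded out to 3b (1-2 FBF); AMMONS advanced to second.', 'MOBERG struck out swinging (2-2 BSSFBS).']
```

The space after "A." is not a split point because the word that follows ("grounded") is lowercase, and the space after "FBF);" is skipped because ";" is not in the lookbehind's character class.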
