Remove whitespace around apostrophes after tokenizing a string

Posted 2024-10-04 05:26:34


I want to remove the whitespace inside words like "can't" or "won't", either with a regex or during detokenization.

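import re

from nltk.tokenize import WordPunctTokenizer
# Assumption: MosesDetokenizer is imported from sacremoses
# (older NLTK releases shipped it as nltk.tokenize.moses.MosesDetokenizer)
from sacremoses import MosesDetokenizer

tok = WordPunctTokenizer()
detok = MosesDetokenizer()

# Matches a run of non-word, non-space characters followed by one space
pattern = r"[^\w ]+ "
text = "i can ' t use this cause they won ' t fit"
string = re.sub(pattern, '', text)
tk = tok.tokenize(string)
output = detok.detokenize(tk, return_str=True)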
print(output)

 "i can 't use this cause they won' t fit"

Any ideas on how to remove the whitespace after "can" and "won" so that I get "can't" and "won't"? When I detokenize, I get two spaces, one before and one after the apostrophe.


2 Answers

I think you can simply do something like this:

output = "i can 't use this cause they won' t fit"
# Replace each space-plus-apostrophe variant with a bare apostrophe
output = output.replace(" '", "'").replace("' ", "'")
print(output)
"i can't use this cause they won't fit"

@BenT I can't speak to the regex, but you can apply the following to the output:

output = "i can 't use this cause they won' t fit"
output = "'".join(output.split(" '"))
output = "'".join(output.split("' "))
print(output)
"i can't use this cause they won't fit"

There is also a one-line solution:

^{pr2}$
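
One possible one-line form, as a rough sketch rather than the answerer's own code, uses `re.sub` to drop any whitespace on either side of an apostrophe that sits between two word characters. The helper name `join_apostrophes` is made up for illustration.

```python
import re


def join_apostrophes(text):
    # Hypothetical helper: remove whitespace around an apostrophe
    # that has a word character on both sides.
    return re.sub(r"(?<=\w)\s*'\s*(?=\w)", "'", text)


print(join_apostrophes("i can ' t use this cause they won ' t fit"))
print(join_apostrophes("i can 't use this cause they won' t fit"))
```

Both calls print `i can't use this cause they won't fit`. Note that this pattern also glues an opening quote to the preceding word (for example `said 'hi` becomes `said'hi`), so it only suits text where apostrophes appear in contractions.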
