识别除某些符号外的正则表达式模式

2024-06-26 01:39:21 发布

您现在位置:Python中文网/ 问答频道 /正文

我目前正在使用下面的正则表达式模式删除句子中的符号

sentence = re.sub("[^a-zA-Z]"," ", sentence)

但是,我想保留所有的-,删除句子中的所有其他符号

例如,在下面提到的句子中,我想得到如下输出

Input: tim-tam is a tasty, yummy chocolate.
Output: tim-tam is a tasty yummy chocolate

如何改进当前的regex模式来实现这一点


Tags: reinputoutputis模式符号sentencetasty
3条回答

[^a-zA-Z-]

除了a-z,a-z或者破折号

regular expression syntax

[^a-zA-Z]表示匹配不在a-z或a-z范围内的任何字符

Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5', and [^^] will match any character except '^'. ^ has no special meaning if it’s not the first character in the set.

如果您还想排除-,请包括它:[^a-zA-Z-]

如果这是你现在的正则表达式

sentence = re.sub("[^a-zA-Z]"," ", sentence)

如果要排除-,请使用

sentence = re.sub("[^a-zA-Z-]"," ", sentence)

[]开头的插入符号表示“不在此字符类中”。因此,将-添加到集合中会将其排除在匹配之外

相关问题 更多 >