Regex: how to match all non-letter characters, no matter where they are in the string?

1 answer

Answer #1 · posted 2024-09-30 20:38:17

Matching

Judging from your question, I believe you are looking for this

M.*?y.*?M.*?o.*?m.*?s.*?h.*?o.*?u.*?s.*?e

or

M[^a-zA-Z]*?y[^a-zA-Z]*?M[^a-zA-Z]*?o[^a-zA-Z]*?m[^a-zA-Z]*?s[^a-zA-Z]*?h[^a-zA-Z]*?o[^a-zA-Z]*?u[^a-zA-Z]*?s[^a-zA-Z]*?e

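The first matches the search string with any characters allowed between its characters (as described in the question body, see regex101). The second allows only non-letter characters between them (as the question title asks, see regex101).

Each pattern is built from the characters of the search string, separated by a sub-pattern that lazily matches either any character (case 1) or any non-letter character (case 2).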
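To see why the quantifiers are lazy rather than greedy, here is a small sketch using Python's standard `re` module. The sample sentence is made up for illustration and does not come from the question.

```python
import re

search_string = "MyMomshouse"
# Hypothetical sample text with a second, later ending in "...se".
text = "My Mom's house, not the mouse"

# Lazy fillers stop at the first way to complete the match.
lazy = re.search(".*?".join(search_string), text)
# Greedy fillers stretch as far as they can while still matching.
greedy = re.search(".*".join(search_string), text)

print(lazy.group())    # My Mom's house
print(greedy.group())  # My Mom's house, not the mouse
```

With the greedy version the match runs on to the end of the sentence, which is usually not what you want when looking for the shortest occurrence.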

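Note: `[^a-zA-Z]` only treats ASCII letters as letters. If you want the second pattern to also exclude "special" word characters such as é, ü or ô, you need to handle them in the regex pattern you use, e.g. with the Unicode category \P{L}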

M\P{L}*?y\P{L}*?M\P{L}*?o\P{L}*?m\P{L}*?s\P{L}*?h\P{L}*?o\P{L}*?u\P{L}*?s\P{L}*?e

\p{L} matches a single code point in the category "letter", and \P{L} matches any code point that is not in that category (see regex101)
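As a rough illustration of what the "letter" category covers, the standard-library `unicodedata` module reports the general category of each code point; categories starting with `L` are the ones `\p{L}` refers to. The helper name `is_letter` is made up for this sketch.

```python
import unicodedata

def is_letter(ch: str) -> bool:
    """True if the code point's general category is a letter (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")

# "\u00e9" is é written as an escape to avoid encoding surprises.
for ch in "M\u00e98? ":
    print(repr(ch), unicodedata.category(ch), is_letter(ch))
```

Here `M` and `é` are reported as letters, while the digit, the question mark and the space are not.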

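Building the expression

Whatever your exact expression is, you can easily build the final regex string by joining the characters of the search string with the sub-pattern of your choice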
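A minimal sketch of that joining step, using only the standard `re` module. The function name `build_pattern` and its `filler` parameter are illustrative and not part of any library; each character is passed through `re.escape` so that search strings containing regex metacharacters stay literal.

```python
import re

def build_pattern(search_string: str, filler: str) -> str:
    """Join the escaped characters of search_string with the filler expression."""
    return filler.join(re.escape(c) for c in search_string)

any_char = build_pattern("MyMom", r".*?")
non_ascii_letter = build_pattern("MyMom", r"[^a-zA-Z]*?")

print(any_char)          # M.*?y.*?M.*?o.*?m
print(non_ascii_letter)  # M[^a-zA-Z]*?y[^a-zA-Z]*?M[^a-zA-Z]*?o[^a-zA-Z]*?m

# Escaping keeps a literal "." from acting as a wildcard.
print(build_pattern("a.b", r".*?"))  # a.*?\..*?b
```

Keeping the filler as a parameter means the same helper covers all the variants discussed here without duplicating the join logic.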

Python example

Here is a Python example (since your question was not tagged with a programming language):

import regex  # third-party module (pip install regex), needed for \p{L} / \P{L}

text = ["text 123 ->My Mom's house<- jidjio", 
        "bla bla ->My8Mo2ms231#43house<- bla bla", 
        "Test string ->My Mom's' house<- further text", 
        "wkashhasMdykMomLsfheoousssswQseBswenksd", 
        "textMy?M?om*s?*hou?*seorsomethingelse",
        "thisIs3MôyMäoméshouseEFSAcasw!"]

search_string = "MyMomshouse"

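# Join the characters of the search string with the chosen filler expression
regex_string = r'.*?'.join(search_string)           # any character, lazily
regex_string2 = r'[^a-zA-Z]*?'.join(search_string)  # anything except ASCII letters
regex_string3 = r'\P{L}*?'.join(search_string)      # anything except Unicode letters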

print('\n - regex 1  -')
for t in text:
    print(regex.search(regex_string, t))

print('\n - regex 2  -')
for t in text:
    print(regex.search(regex_string2, t))

print('\n - regex 3  -')
for t in text:
    print(regex.search(regex_string3, t))

Output:


 - regex 1  -
<regex.Match object; span=(11, 25), match="My Mom's house">
<regex.Match object; span=(10, 29), match='My8Mo2ms231#43house'>
<regex.Match object; span=(14, 29), match="My Mom's' house">
<regex.Match object; span=(8, 31), match='MdykMomLsfheoousssswQse'>
<regex.Match object; span=(4, 22), match='My?M?om*s?*hou?*se'>
<regex.Match object; span=(7, 21), match='MôyMäoméshouse'>

 - regex 2  -
<regex.Match object; span=(11, 25), match="My Mom's house">
<regex.Match object; span=(10, 29), match='My8Mo2ms231#43house'>
<regex.Match object; span=(14, 29), match="My Mom's' house">
None
<regex.Match object; span=(4, 22), match='My?M?om*s?*hou?*se'>
<regex.Match object; span=(7, 21), match='MôyMäoméshouse'>

 - regex 3  -
<regex.Match object; span=(11, 25), match="My Mom's house">
<regex.Match object; span=(10, 29), match='My8Mo2ms231#43house'>
<regex.Match object; span=(14, 29), match="My Mom's' house">
None
<regex.Match object; span=(4, 22), match='My?M?om*s?*hou?*se'>
None

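Notes:

I use the Python regex module instead of the re module because it supports the \p{L} pattern
If the search string contains characters that have a special meaning in regular expressions, you need to escape them when building the pattern, e.g. '.*?'.join(regex.escape(str(c)) for c in search_string)
I used the search string MyMomshouse (no spaces) instead of the one you specified, because your string does not match the second of the example strings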
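If installing the third-party `regex` module is not an option, the standard `re` module can approximate `\P{L}` with the class `[\W\d_]`, since for `str` patterns `\w` covers Unicode alphanumerics plus the underscore. This is only an approximation (a few numeric characters that are not decimal digits still count as `\w`), so treat the following as a sketch rather than an exact equivalent.

```python
import re

search_string = "MyMomshouse"
# [\W\d_] = non-word characters, digits and underscore, i.e. roughly "not a letter".
pattern = r"[\W\d_]*?".join(re.escape(c) for c in search_string)

samples = [
    "bla bla ->My8Mo2ms231#43house<- bla bla",
    "thisIs3M\u00f4yM\u00e4om\u00e9shouseEFSAcasw!",  # contains ô, ä, é
]
for s in samples:
    print(re.search(pattern, s))
```

The first sample matches because only digits and punctuation sit between the letters; the second yields `None` because the accented characters count as letters.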

JavaScript example:

The same can be done in JavaScript or, in principle, in any language. See also this JS fiddle:

const text = ["text 123 ->My Mom's house<- jidjio", 
        "bla bla ->My8Mo2ms231#43house<- bla bla", 
        "Test string ->My Mom's' house<- further text", 
        "wkashhasMdykMomLsfheoousssswQseBswenksd", 
        "textMy?M?om*s?*hou?*seorsomethingelse",
        "thisIs3MôyMäoméshouseEFSAcasw!"];
      
const search_string = "MyMomshouse";

const regex_string = Array.from(search_string).join('.*?')

console.log(regex_string)

text.forEach((entry) => {
    console.log(entry.search(regex_string));
});

However, support for Unicode character classes is not always available; see this SO question and its answers for possible solutions
