Regular expression that matches only the content outside all types of brackets

3 answers

User

Reply 1 · edited 2024-09-30 14:35:47

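This regex handles the ordinary text brackets (), [] and {}.
To add your own, copy one of the blocks, paste it, and change the delimiters to
the brackets you want. Mind the recursion group: each block must recurse into its own group number.
Add the opening bracket to the exclusion list as well.
Also note the fall-through [\S\s] at the end, which picks up any stray (unbalanced) characters.

Update: added all the bracket types (from the comments).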

/(?:[^\(\[{〈【〔（［]+|(?:(\((?>[^()]++|(?1))*\))|({(?>[^{}]++|(?2))*})|(\[(?>[^\[\]]++|(?3))*\])|(〈(?>[^〈〉]++|(?4))*〉)|(【(?>[^【】]++|(?5))*】)|(〔(?>[^〔〕]++|(?6))*〕)|(（(?>[^（）]++|(?7))*）)|(［(?>[^［］]++|(?8))*］))(*SKIP)(*FAIL)|[\S\s])/
https://regex101.com/r/LUXJVu/1

 (?:
    [^\(\[{〈【〔（［]+ 
  | 
    (?:
       (                   # (1 start), Left/Right parenthesis
          \(    
          (?>
             [^()]++ 
           | (?1) 
          )*
          \)                     
       )                   # (1 end)
     | 
       (                   # (2 start), Left/Right curly bracket
          {
          (?>
             [^{}]++ 
           | (?2) 
          )*
          }
       )                   # (2 end)
     | 
       (                   # (3 start), Left/Right square bracket
          \[ 
          (?>
             [^\[\]]++ 
           | (?3) 
          )*
          \] 
       )                   # (3 end)
     | 
       (                   # (4 start), Left/Right angle bracket
          〈
          (?>
             [^〈〉]++ 
           | (?4) 
          )*
          〉
       )                   # (4 end)
     | 
       (                   # (5 start), Left/Right black lenticular bracket
          【
          (?>
             [^【】]++ 
           | (?5) 
          )*
          】
       )                   # (5 end)
     | 
       (                   # (6 start), Left/Right tortoise bracket
          〔
          (?>
             [^〔〕]++ 
           | (?6) 
          )*
          〕
       )                   # (6 end)
     | 
       (                   # (7 start), Left/Right fullwidth parenthesis
          （
          (?>
             [^（）]++ 
           | (?7) 
          )*
          ）
       )                   # (7 end)
     | 
       (                   # (8 start), Left/Right fullwidth square bracket
          ［
          (?>
             [^［］]++ 
           | (?8) 
          )*
          ］
       )                   # (8 end)
    )
    (*SKIP) (*FAIL) 
  | 
    [\S\s] 
 )
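
As a rough sketch of how the pattern might be driven from Perl (which has supported recursion and the (*SKIP)(*FAIL) verbs since 5.10), the snippet below collects every piece of text that lies outside balanced brackets. It is cut down to the three ASCII bracket types to stay short, and the helper name outside_brackets is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): returns the pieces of the input
# that lie outside balanced (), {} and [] pairs.
sub outside_brackets {
    my ($text) = @_;
    my $re = qr/
        (?:
            [^(\[{]+                               # text up to the next opening bracket
          |
            (?:
                ( \( (?> [^()]++   | (?1) )* \) )  # group 1: balanced round brackets
              | ( \{ (?> [^{}]++   | (?2) )* \} )  # group 2: balanced curly brackets
              | ( \[ (?> [^\[\]]++ | (?3) )* \] )  # group 3: balanced square brackets
            )
            (*SKIP)(*FAIL)                         # drop the bracketed span, resume after it
          |
            [\S\s]                                 # fall-through for a stray bracket
        )
    /x;
    my @pieces;
    # The groups only take part in the failing branch, so read the whole match.
    push @pieces, $& while $text =~ /$re/g;
    return @pieces;
}

print "[$_]\n" for outside_brackets('hello (hi that [is] so cool) awesome {yeah}');
```

The pieces come back with their surrounding whitespace intact, so trim them if needed. A bracket that never closes is returned as a one-character piece by the fall-through branch.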

User

Reply 2 · edited 2024-09-30 14:35:47

my $string = 'hello (hi that [is] so cool) awesome {yeah} <and <then> some (even {more})>';
1 while $string =~ s/\([^(]*?\) *//;  #remove all ()
1 while $string =~ s/\[[^\[]*?\] *//; #remove all []
1 while $string =~ s/\{[^{]*?\} *//;  #remove all {}
1 while $string =~ s/<[^<]*?> *//;    #remove all <>
print "What is left now: $string\n";  #hello awesome

Or all in one:

1 while $string=~s/( \([^(]*?\) | \[[^[]*?\] | \{[^{]*?\} | <[^<]*?>  ) \s*//xg;
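
The loop works because each pass can only delete a pair that has no bracket of its own kind inside it, so repeating until nothing changes peels nested groups from the inside out. A minimal sketch wrapping that idea in a function follows. The name strip_brackets is hypothetical, and this variant excludes both the opening and the closing bracket inside each pair and tidies whitespace at the end instead of eating trailing spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: removes (), [], {} and <> groups, innermost first.
sub strip_brackets {
    my ($s) = @_;
    # Each pass deletes one pair that contains no bracket of its own kind;
    # repeat until no such pair is left.
    1 while $s =~ s/ \( [^()]* \) | \[ [^\[\]]* \] | \{ [^{}]* \} | < [^<>]* > //x;
    $s =~ s/\s+/ /g;         # collapse runs of whitespace left behind
    $s =~ s/^\s+|\s+$//g;    # trim both ends
    return $s;
}

print strip_brackets('hello (hi that [is] so cool) awesome {yeah} <and <then> some (even {more})>'), "\n";
```

Note that this approach does not check that different bracket kinds are nested consistently, so input such as (a [b) c] is reduced without complaint.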

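User

Reply 3 · edited 2024-09-30 14:35:47

This touches on the tricky problem of matching delimiters, possibly nested ones.

Rather than wrestling with one big regex, I suggest using the core module Text::Balanced to parse the string for the text outside all balanced (top-level) brackets, which is exactly what the question describes.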

use warnings;
use strict;
use feature 'say';

use Text::Balanced qw(extract_bracketed);

my $string = 'hello (hi that [is] so cool) awesome {yeah}';

my @outside_of_brackets;

my ($match, $before);
my $remainder = $string;
while (1) {
    ($match, $remainder, $before) = extract_bracketed(
        $remainder, '(){}[]', '[^({[]*'
    );
    push @outside_of_brackets, $before // $remainder;
    last if not defined $match; 
}

say for @outside_of_brackets;

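We ask for the content of the first top-level pair of any of the given brackets,^† and along with it we get what follows that pair ($remainder) and what precedes it ($before).

What we need here is $before, so we keep parsing $remainder the same way, collecting $before each time, until there are no more matches. At that point $remainder contains no brackets, so we take it as well (by then $before is not defined, hence the $before // $remainder fallback).

The code produces the expected strings, with some extra whitespace; trim as needed.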
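
As an illustration of the trimming step, here is a minimal sketch that wraps the same loop in a function, trims each piece, and skips the empty ones. The function name text_outside_brackets is made up for this example; extract_bracketed is called exactly as in the code above.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical wrapper: returns the trimmed, non-empty pieces of text that
# lie outside balanced (), {} and [] pairs.
sub text_outside_brackets {
    my ($text) = @_;
    my @pieces;
    while (1) {
        my ($match, $rest, $before) =
            extract_bracketed($text, '(){}[]', '[^({[]*');
        # On success take the skipped prefix; on failure take what is left.
        my $piece = defined $match ? $before : $rest;
        $piece =~ s/^\s+|\s+$//g;
        push @pieces, $piece if length $piece;
        last unless defined $match;
        $text = $rest;    # continue after the extracted brackets
    }
    return @pieces;
}

print "$_\n" for text_outside_brackets('hello (hi that [is] so cool) awesome {yeah}');
```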

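For another example, and for another approach using Regexp::Common, see this post

^† extract_bracketed extracts the content of the first top-level balanced pair of brackets, which by default must be found at the beginning of the string (after optional whitespace) or right after the end of the previous match; alternatively, it must be found after the pattern given in the third argument, if there is one (hence the * quantifier here, in case the bracket is at the very beginning)

So in this example it skips ahead to the first opening bracket and then parses the string for a balanced pair of brackets. The second argument gives the types of brackets to look for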
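
To make the role of the third argument concrete, the sketch below compares a call without a prefix pattern against one with it. The helper first_bracketed is a hypothetical name, and only round brackets are requested to keep the cases small.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: returns the first bracketed chunk, or undef on failure.
# The prefix is optional; without it Text::Balanced applies its default,
# which only skips leading whitespace.
sub first_bracketed {
    my ($text, $prefix) = @_;
    my ($match) = defined $prefix
        ? extract_bracketed($text, '()', $prefix)
        : extract_bracketed($text, '()');
    return $match;
}

for my $case (['  (x) y'], ['hello (x) y'], ['hello (x) y', '[^(]*']) {
    my $found = first_bracketed(@$case);
    printf "%-14s => %s\n", "'$case->[0]'", defined $found ? $found : 'no match';
}
```

Without a prefix, the second case fails because the text starts with a word rather than a bracket; with the prefix [^(]* the leading text is skipped first.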
