Regex that ends at a given character should not stop at earlier occurrences of that character

Posted 2024-09-25 10:25:19

I have data that looks like this:

<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is goingto be fixed!"
>
</workorder> Some more random text

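I need to capture everything from the start of "<workorder" up to the ">" that closes the tag. The problem is that my regex stops at the closing angle bracket ">" of the second imp tag. See the figure for more details.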

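I am testing my regex on regex101 with the flavor set to Python and the flags global, single line, and multiline. Single line basically means that the . operator also matches line endings.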

Here is my regex:

 *(<workorder.*?>$)(.?)

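There is a space before the first asterisk. Is there a way to capture everything up to the ">" that closes the tag, rather than stopping at an earlier ">"?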
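To reproduce the behaviour outside regex101, here is a minimal sketch with Python's `re` module. It assumes the single line and multiline flags map to `re.S` and `re.M`; the variable names are only illustrative.

```python
import re

# Sample data from the question (first variant).
data = '''<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is goingto be fixed!"
>
</workorder> Some more random text'''

# The pattern from the question. re.S lets "." match newlines,
# re.M lets "$" match at the end of every line.
pattern = re.compile(r' *(<workorder.*?>$)(.?)', re.S | re.M)

match = pattern.search(data)
captured = match.group(1)

# The lazy ".*?" stops at the first ">" that sits at the end of a line.
# That is the second <imp> on the comment line, not the tag's own ">".
print(captured.endswith('random text<imp>'))  # True
```

The capture ends early because `>$` only asks for a ">" at the end of any line, and the comment line happens to end with one.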

The data may also look like this, where the ">" is right next to the '"' character:

<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is goingto be fixed!">
</workorder> Some more random text

Or like this, where the ">" is right next to the "/" character:

<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is going to be fixed!"/> 
Some more random text

Or like this, where the ">" is next to the "/" character but on the next line:

<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is going to be fixed!"
/> 
Some more random text

2 answers

User

Answer 1 · edited 2024-09-25 10:25:19

Perhaps you can find an XML/HTML parser for this. If you want a regex, you can try the following:

(<workorder[\s\S]*?(?:<\/workorder>|\/>))

where:

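The outer (...) - captures the result
<workorder - matches the opening string
[\s\S]*? - matches any character non-greedily, so the match does not run across multiple workorder groups
(?:<\/workorder>|\/>) - matches the closing string, whether it is </workorder> or />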
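As a rough check, the pattern can be tried in Python against two of the variants from the question. This is only a sketch: the variable names are mine, and the slashes are left unescaped because Python does not need them escaped.

```python
import re

# Pattern from the answer, without the "\/" escapes.
pattern = re.compile(r'(<workorder[\s\S]*?(?:</workorder>|/>))')

# First variant from the question: separate closing tag.
closed_tag = '''<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is goingto be fixed!"
>
</workorder> Some more random text'''

# Last variant from the question: "/>" on its own line.
self_closing = '''<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is going to be fixed!"
/>
Some more random text'''

first = pattern.search(closed_tag).group(1)
second = pattern.search(self_closing).group(1)

# Both captures run past the inner <imp> and <role> markers and
# end at the workorder terminator, leaving the trailing text out.
print(first.endswith('</workorder>'))  # True
print(second.endswith('/>'))           # True
```

Note that the match ends at the first `</workorder>` or `/>` it meets, so a literal `/>` inside the comment text would still cut the capture short.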

User

Answer 2 · edited 2024-09-25 10:25:19

This PCRE regex should help extract the data from < to >:

<.*>

The flags should be:

g  > Global
i  > Case Insensitive
s  > Single Line
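For comparison, here is a small Python sketch of the same pattern. It assumes the i and s flags correspond to `re.I` and `re.S`, and that the g flag corresponds to `findall`; the two-workorder string is a made-up example, not data from the question.

```python
import re

data = '''<workorder id = "124"
       issue = "broken hood"
       level = "minor"
       comment = " This will be some random text <imp>random text<imp>
         <role>Important<role> So this is goingto be fixed!"
>
</workorder> Some more random text'''

pattern = re.compile(r'<.*>', re.S | re.I)

# The greedy ".*" with re.S runs from the first "<" to the last ">"
# in the whole input, which here is the end of </workorder>.
whole = pattern.search(data).group(0)
print(whole.endswith('</workorder>'))  # True

# Hypothetical input with two workorders: the greedy match
# swallows both of them as a single result.
two = '<workorder id = "1"/> text <workorder id = "2"/>'
matches = pattern.findall(two)
print(len(matches))  # 1
```

So this works when the input holds a single workorder, but it will not separate several workorders in one string.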
