How do I duplicate a specific tag and its contents in a file?

<tag1><style="1">"Lorem ipsum dolor...</style>"Lorem Ipsum dolor"</tag1><tagen1><style="1">"Lorem ipsum dolor...</style>"Lorem Ipsum dolor"</tagen1> <tag1>"Other Lorem ipsum Dolor"</tag1><tagen1>"Other Lorem ipsum Dolor"</tagen1><tag1>"Lorem ipsum DOLOR"</tag1><tagen1>"Lorem ipsum DOLOR"</tag1>

1 Answer

User

Reply #1 · Posted 2024-09-30 10:29:02

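Regular expressions may not be the best tool for this task. If we do have to rely on them for part of it, we can start with a generic expression that captures the tags and then script the rest of the problem.

A library or package for parsing markup should also be able to help here.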
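If the real file is well-formed XML, a parser could do the copying without any regular expression. The sketch below uses Python's standard `xml.etree.ElementTree` and rests on several assumptions that the question does not confirm: a single root element (a hypothetical `<root>` here), named attributes (an XML parser would reject the `<style="1">` from the question's sample, so `id="1"` stands in for it), no `tag1` nested inside another `tag1`, and a goal of following each `tag1` element with a copy renamed `tagen1`. The helper name `duplicate_elements` and its parameters are illustrative only.

```python
import copy
import xml.etree.ElementTree as ET


def duplicate_elements(xml_text, src="tag1", dst="tagen1"):
    """Return xml_text with a renamed copy inserted after every src element."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so the copies we insert are not visited again.
    for parent in list(root.iter()):
        # Walk children right to left so insertions do not shift pending indexes.
        for index, child in reversed(list(enumerate(parent))):
            if child.tag == src:
                clone = copy.deepcopy(child)
                clone.tag = dst
                # The clone keeps the trailing text; clear it on the original
                # so the copy sits directly after the original's closing tag.
                child.tail = None
                parent.insert(index + 1, clone)
    return ET.tostring(root, encoding="unicode")


# Hypothetical well-formed input, not the sample from the question.
sample = '<root><tag1><style id="1">a</style>b</tag1> <tag1>c</tag1></root>'
result = duplicate_elements(sample)
print(result)
```

If the file is not well-formed, as the sample in the question appears to be, the parser raises `ParseError` and a text-based approach is the fallback.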

To capture the tags:

(<([\w\s=\"]+?)>)|(<\/([\w]+?)>)|(.+?)
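The expression above has three alternatives: an opening tag (groups 1 and 2), a closing tag (groups 3 and 4), and a lazy fallback `(.+?)` (group 5) that matches a single character at a time. A minimal sketch of how those groups could be turned into a token stream follows. The `tokenize` helper and the token names `open`, `close` and `text` are illustrative assumptions, not part of the original answer.

```python
import re

# Same expression as above: opening tag | closing tag | any single character.
TAG_RE = re.compile(r"(<([\w\s=\"]+?)>)|(<\/([\w]+?)>)|(.+?)")


def tokenize(text):
    """Yield (kind, value) pairs, merging runs of single-character text matches."""
    buffer = []
    for m in TAG_RE.finditer(text):
        if m.group(5) is not None:
            # Fallback branch: exactly one character of plain text.
            buffer.append(m.group(5))
            continue
        if buffer:
            yield ("text", "".join(buffer))
            buffer = []
        if m.group(1) is not None:
            yield ("open", m.group(2))
        else:
            yield ("close", m.group(4))
    if buffer:
        yield ("text", "".join(buffer))


tokens = list(tokenize('<tag1><style="1">"Lorem"</style>"Ipsum"</tag1>'))
for kind, value in tokens:
    print(kind, value)
```

Note that `.` does not match a newline unless `re.DOTALL` is set, so line breaks in the input would be dropped by this sketch.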


Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

# Alternatives: opening tag (groups 1-2) | closing tag (groups 3-4) | any single character (group 5).
regex = r"(<([\w\s=\"]+?)>)|(<\/([\w]+?)>)|(.+?)"

test_str = "<tag1><blah =\"1\">\"sdfhds^\"</blah>\"*@$%\"</tag1><some_different_tag>\"aihdihihaif\"</some_different_tag><tag1>\"92763972649@^&^(@$<>ihagHGWi!_*#&)!#&@#$!^#\"</tag1><tag1>\"sdwuqioyewoqiuy&%*(&!*#%(*!#$!$ 4545___)0rrf\"</tag1>"

matches = re.finditer(regex, test_str, re.MULTILINE)

for match_num, match in enumerate(matches, start=1):
    print("Match {} was found at {}-{}: {}".format(match_num, match.start(), match.end(), match.group()))

    # Groups that did not take part in a match are reported as None at position -1.
    for group_num in range(1, len(match.groups()) + 1):
        print("Group {} found at {}-{}: {}".format(group_num, match.start(group_num), match.end(group_num), match.group(group_num)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
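As for scripting the rest of the problem, here is a minimal sketch. It assumes the goal, as the sample in the question suggests, is to follow every `<tag1>…</tag1>` element with a copy wrapped in `<tagen1>…</tagen1>`. It uses a narrower pattern than the generic expression above, and it assumes that `tag1` carries no attributes and is never nested inside another `tag1`. The helper `duplicate_tag` and its parameter names are hypothetical.

```python
import re


def duplicate_tag(text, src="tag1", dst="tagen1"):
    """Append a <dst> copy right after every <src>...</src> element in text."""
    name = re.escape(src)
    # Lazy match so each element ends at its own closing tag;
    # DOTALL lets the content span several lines.
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(name), re.DOTALL)

    def add_copy(match):
        # Keep the original element and add the renamed copy after it.
        return "{0}<{1}>{2}</{1}>".format(match.group(0), dst, match.group(1))

    # A function replacement avoids backslash escapes being interpreted in the content.
    return pattern.sub(add_copy, text)


source = '<tag1><style="1">"Lorem ipsum dolor...</style>"Lorem Ipsum dolor"</tag1> <tag1>"Other Lorem ipsum Dolor"</tag1>'
duplicated = duplicate_tag(source)
print(duplicated)
```

Running the function a second time would duplicate the `tag1` elements again, so it is meant to be applied once to the original file contents.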

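RegEx Circuit

jex.im can visualize regular expressions as diagrams.

If this expression is not what you need and you wish to modify it, regex101.com is a convenient place to edit and test it.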
