Regex (using Python 3.5): finding strings in a file

Posted 2024-10-03 17:15:42


I am opening an Outlook .msg file and need to extract some specific data from it. I am still fairly new to regex and am having trouble finding what I need.

Here is the data in the file. Note that it contains some tab characters:

NEWS ID:    918273/1
TITLE:  News Platform Solution Overview (CNN) (US English Session)
ACCOUNT:    supernewsplatformacct (55712)

Your request has been completed.

Output Format   MP4

Please click on the "Download File" link below to access the download page.

Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>

I need:

918273 from NEWS ID: 918273/1

News Platform Solution Overview (CNN) (US English Session) from TITLE: News Platform Solution Overview (CNN) (US English Session)

supernewsplatformacct from ACCOUNT: supernewsplatformacct (55712)

http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4 from Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>

I am trying:

[\n\r][ \t]*NEWS ID:[ \t]*([^\n\r]*)

But I am having no luck. Any help would be greatly appreciated!


2 answers
(?:^|(?<=\n))[^:<\n]*[:<](.*)

You can use this with re.findall. See the demo:

https://regex101.com/r/d7RPNB/2
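As a rough illustration of how that pattern behaves with re.findall, here is a minimal sketch run against the sample text from the question (tabs are shown as spaces here, which is an assumption about the real file). The captured values keep their leading whitespace, and the URL keeps its trailing ">", so the cleanup step with strip() and rstrip('>') is an addition for illustration and not part of the answer above.

```python
import re

# Sample text from the question (tabs written as spaces; an assumption).
msg = """NEWS ID:    918273/1
TITLE:  News Platform Solution Overview (CNN) (US English Session)
ACCOUNT:    supernewsplatformacct (55712)

Your request has been completed.

Output Format   MP4

Please click on the "Download File" link below to access the download page.

Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>"""

# Pattern from the answer: anchor at a line start, skip up to the first
# ':' or '<', then capture the rest of the line.
pattern = r'(?:^|(?<=\n))[^:<\n]*[:<](.*)'

raw = re.findall(pattern, msg)

# Extra cleanup (not in the answer): drop surrounding whitespace and the
# closing '>' left over from the "Download File <...>" line.
cleaned = [value.strip().rstrip('>') for value in raw]
print(cleaned)
```

Note that this still returns "918273/1" and "supernewsplatformacct (55712)", so the "/1" and "(55712)" suffixes would need to be trimmed separately to get exactly what the question asks for.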

msg = """NEWS ID:    918273/1
TITLE:  News Platform Solution Overview (CNN) (US English Session)
ACCOUNT:    supernewsplatformacct (55712)

Your request has been completed.

Output Format   MP4

Please click on the "Download File" link below to access the download page.

Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>"""
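import re

# First alternative captures the text after "LABEL:"; the second captures the URL inside <...>.
regex = r'[^:]+:\s+(.*)$|[^<]+<([^>]+)>'

matches = []
for line in msg.split('\n'):
    m = re.match(regex, line)  # match once per line instead of three times
    if m:  # blank lines and lines with neither ':' nor '<...>' give None
        matches.append(m.group(1) or m.group(2))
print(matches)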
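Both answers return the whole value after the label, for example "918273/1" and not "918273". If only the four values listed in the question are wanted, one option is a separate pattern per field. The sketch below is an illustration under assumptions, not either answerer's method: the field names and the extract_fields helper are hypothetical, and the patterns assume the labels appear exactly as in the sample. It uses re.MULTILINE with ^, which also avoids one possible problem with the attempt in the question: that pattern requires a newline before "NEWS ID", so it cannot match when "NEWS ID" is the very first line of the text.

```python
import re

# Sample text from the question (tabs written as spaces; an assumption).
msg = """NEWS ID:    918273/1
TITLE:  News Platform Solution Overview (CNN) (US English Session)
ACCOUNT:    supernewsplatformacct (55712)

Your request has been completed.

Output Format   MP4

Please click on the "Download File" link below to access the download page.

Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>"""

# Hypothetical field names; each pattern has exactly one capture group.
FIELD_PATTERNS = {
    'news_id': r'^[ \t]*NEWS ID:[ \t]*(\d+)',        # digits before "/1"
    'title': r'^[ \t]*TITLE:[ \t]*(.+?)[ \t\r]*$',   # rest of line, trimmed
    'account': r'^[ \t]*ACCOUNT:[ \t]*(\S+)',        # name before " (55712)"
    'url': r'Download File[ \t]*<([^>\s]+)>',        # text inside <...>
}


def extract_fields(text):
    """Return a dict mapping field name to captured value, or None if absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        m = re.search(pattern, text, flags=re.MULTILINE)
        result[name] = m.group(1) if m else None
    return result


fields = extract_fields(msg)
for name in sorted(fields):
    print(name, '=', fields[name])
```

A field that is missing from the message comes back as None, so the caller can tell "not found" apart from an empty value.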
