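I have been trying to extract a unique list of account names from a text file, but I can't manage it because I know nothing about regex.
Suppose we have this example: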
[Friday 17/10/2014 @ 07:30:55] The user user01 | account01 | namename1 has been granted access.
[Friday 17/10/2014 @ 07:30:57] The user user two | account_two | name2 has been granted access.
[Friday 17/10/2014 @ 07:30:59] The user user_three | account_ | name3 here3 has been granted access.
[Friday 17/10/2014 @ 07:31:41] The user user01 | account01 | namename1 has been granted access.
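I want it to find the account information between the two pipes (|) and strip the pipes and surrounding whitespace, so that it can write a plain list to a text file once it has gone through and removed any duplicates. For the sample above, that list would be account01, account_two and account_.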
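One check it must make is to pick up the account information only when the line contains the phrase "has been granted access.", because the data might look like this: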
[Friday 17/10/2014 @ 07:30:55] The user user01 | account01 | namename1 has been granted access.
[Friday 17/10/2014 @ 07:30:57] The user user two | account_two | name2 has been granted access.
[Friday 17/10/2014 @ 07:30:59] Details Granted | user two | access number 01239
[Friday 17/10/2014 @ 07:30:59] The user user_three | account_ | name3 here3 has been granted access.
[Friday 17/10/2014 @ 07:31:41] The user user01 | account01 | namename1 has been granted access.
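I don't want it to pick up the account information "user two" from line 3 of that example.
Could anyone help me with some example code? It would be much appreciated.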
Two lines of Python should take care of it.
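A minimal sketch of what such code might look like, assuming a regular expression is acceptable. The pattern, the name `find_accounts` and the trimmed sample are illustrative and not taken from the original post. It captures the field between the first two pipes, and only on lines that contain the phrase "has been granted access.".

```python
import re

# Field between the first two pipes, only when the line also contains
# the phrase "has been granted access." (illustrative pattern).
ACCOUNT_RE = re.compile(r"\|\s*([^|]*?)\s*\|.*has been granted access\.")

def find_accounts(text):
    """Return the set of account names found on granted-access lines."""
    accounts = set()
    for line in text.splitlines():
        match = ACCOUNT_RE.search(line)
        if match:
            accounts.add(match.group(1))
    return accounts

sample = """\
[Friday 17/10/2014 @ 07:30:55] The user user01 | account01 | namename1 has been granted access.
[Friday 17/10/2014 @ 07:30:59] Details Granted | user two | access number 01239
[Friday 17/10/2014 @ 07:30:59] The user user_three | account_ | name3 here3 has been granted access.
"""

print(find_accounts(sample))
```

Matching line by line keeps the pattern from ever spanning two log entries, and the set removes duplicates automatically.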
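If you want to run it from the command line, just put those two lines in a .py file with a shebang (for example search.py), make the file executable, and run it directly. Or pass the file to the Python interpreter instead.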
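A sketch of such a script, assuming it is saved as `search.py` and that the log file path is given as the first command-line argument. The regex and the helper name `unique_accounts` are assumptions for illustration.

```python
#!/usr/bin/env python3
import re
import sys

# Illustrative pattern: field between the first two pipes on lines that
# contain the phrase "has been granted access."
ACCOUNT_RE = re.compile(r"\|\s*([^|]*?)\s*\|.*has been granted access\.")

def unique_accounts(path):
    """Read the log file at `path` and return the set of account names."""
    with open(path, encoding="utf-8") as log:
        matches = (ACCOUNT_RE.search(line) for line in log)
        return {m.group(1) for m in matches if m}

if __name__ == "__main__":
    if len(sys.argv) == 2:
        # sys.argv[1] is assumed to be the path of the log file.
        print(unique_accounts(sys.argv[1]))
    else:
        print("usage: search.py LOGFILE", file=sys.stderr)
```

After `chmod +x search.py` it could be started as `./search.py access.log`, or without the executable bit as `python3 search.py access.log`. Here `access.log` is only a placeholder file name.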
If you have a lot of accounts, you will probably want each account printed only once, on a line of its own.
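One possible way to do that, assuming the account names have already been collected into a list. The list `found` is sample data and `unique_in_order` is a hypothetical helper. `dict.fromkeys` drops duplicates while keeping the order in which names first appeared.

```python
def unique_in_order(items):
    """Drop duplicates while keeping the order of first appearance."""
    return list(dict.fromkeys(items))

# Sample data standing in for the names pulled from the log.
found = ["account01", "account_two", "account_", "account01"]

# One account per line, each printed only once.
for account in unique_in_order(found):
    print(account)
```

A plain `set` would also remove the duplicates, but it does not guarantee any particular output order.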
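I had completely overlooked split, but a fully working version can be built on it.
Split each line on | and take the second part of the split, then strip the whitespace around it. Build an accountlist by adding an account only if it is not already in the list, which removes duplicates. Last but not least, write all the accounts to output.txt.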
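The steps above can be sketched roughly as follows. The function names and the input path parameter are assumptions; only `accountlist` and `output.txt` come from the description.

```python
PHRASE = "has been granted access."

def collect_accounts(lines):
    """Return account names from granted-access lines, without duplicates."""
    accountlist = []
    for line in lines:
        if PHRASE not in line:
            continue  # skip lines such as "Details Granted | user two | ..."
        parts = line.split("|")
        if len(parts) < 3:
            continue  # not enough pipes to hold an account field
        account = parts[1].strip()  # trims leading and trailing whitespace
        if account not in accountlist:
            accountlist.append(account)
    return accountlist

def write_accounts(in_path, out_path="output.txt"):
    """Read the log at in_path and write one account per line to out_path."""
    with open(in_path, encoding="utf-8") as src:
        accounts = collect_accounts(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        for account in accounts:
            dst.write(account + "\n")
    return accounts
```

The `not in accountlist` check is linear in the list length, which is fine for small logs. For very large files a set kept alongside the list would make the membership test faster.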