<p>I have a log file that I am trying to parse. A sample log entry looks like this:</p>
<blockquote>
<p>Oct 23 13:03:03.714012 prod1_xyz(RSVV)[201]: #msgtype=EVENT #server=Web/Dev@server1web #func=LKZ_WriteData ( line 2992 ) #rc=0 #msgid=XYZ0064 #reqid=0 #msg=Web Activity end (section 200, # SysD 1, Files 222, Bytes 343422089928, Errors 0, Aborted Files 0, Busy Files 0)</p>
</blockquote>
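<p>I want to extract every piece of text that starts with a hash and consists of a key and a value, for example <code>#msgtype=EVENT</code>. Any hash that is not followed by a key and an "=" sign should be treated as part of the preceding value.</p>
<p>So for the log entry above, I want a list like the following:</p>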
<pre><code>#msgtype=EVENT
#server=Web/Dev@server1web
#func=LKZ_WriteData ( line 2992 )
#rc=0
#msgid=XYZ0064
#reqid=0
#msg=Web Activity end (section 200, # SysD 1, Files 222, Bytes 343422089928, Errors 0, Aborted Files 0, Busy Files 0) (Notice the hash present in the middle of the text)
</code></pre>
<p>I have tried Python's <code>re.findall</code>, but it does not capture all of the data.</p>
<p>For example:</p>
<pre><code>import re

line = 'Oct 23 13:03:03.714012 prod1_xyz(RSVV)[201]: #msgtype=EVENT #server=Web/Dev@server1web #func=LKZ_WriteData ( line 2992 ) #rc=0 #msgid=XYZ0064 #reqid=0 #msg=Web Activity end (section 200, # SysD 1, Files 222, Bytes 343422089928, Errors 0, Aborted Files 0, Busy Files 0)'
z = re.findall("(#.+?=.+?)(:?#|$)", line)
print(z)
</code></pre>
<p>Output:</p>
<pre><code>[('#msgtype=EVENT ', '#'), ('#func=LKZ_WriteData ( line 2992 ) ', '#'), ('#msgid=XYZ0064 ', '#'), ('#msg=Web Activity end (section 200, ', '#')]
</code></pre>
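<p>Every other pair is missing because the second group consumes the <code>#</code> that ends each match, so the next key no longer starts with a hash when the search resumes. One possible direction, offered only as a sketch, is to end each value with a lookahead, which checks for the next key without consuming it. It assumes a key is a <code>#</code> followed by word characters and an <code>=</code>, and that values never contain that shape themselves; <code>PAIR</code> and <code>extract_pairs</code> are hypothetical names.</p>

```python
import re

line = ("Oct 23 13:03:03.714012 prod1_xyz(RSVV)[201]: #msgtype=EVENT "
        "#server=Web/Dev@server1web #func=LKZ_WriteData ( line 2992 ) #rc=0 "
        "#msgid=XYZ0064 #reqid=0 #msg=Web Activity end (section 200, # SysD 1, "
        "Files 222, Bytes 343422089928, Errors 0, Aborted Files 0, Busy Files 0)")

# Assumption: a key looks like "#" + word characters + "=".
# The value is matched lazily and stops right before whitespace followed by
# the next key, or at the end of the line. The lookahead (?=...) does not
# consume the next "#", so the following pair can still be found.
PAIR = re.compile(r"#\w+=.*?(?=\s+#\w+=|$)")

def extract_pairs(text):
    """Return all '#key=value' substrings found in text."""
    return PAIR.findall(text)

for pair in extract_pairs(line):
    print(pair)
```

<p>With this pattern the lone <code>#</code> before <code>SysD</code> stays inside the <code>#msg</code> value, because it is followed by a space rather than by a key and an <code>=</code>.</p>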