<p>First, the errors are reported only for lines 3, 4 and 5 because of the <code>skip_header</code> and <code>skip_footer</code> arguments.</p>
<p>Without <code>skip_footer</code>:</p>
<pre><code>import numpy as np
data = np.genfromtxt('example.txt',
                     skip_header=1,
                     names=True,
                     dtype=None,
                     delimiter='|',
                     encoding='utf-8',
                     filling_values=None)
</code></pre>
<p>Error:</p>
<pre><code> Line #3 (got 14 columns instead of 13)
Line #4 (got 14 columns instead of 13)
Line #5 (got 14 columns instead of 13)
Line #6 (got 14 columns instead of 13)
Line #7 (got 14 columns instead of 13)
</code></pre>
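<p>So, first of all, <code>skip_header</code> should be 0 (the default), so that <code>names=True</code> reads the field names from the header line.
Code:</p>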
<pre><code>data = np.genfromtxt('example.txt',
                     names=True,
                     dtype=None,
                     delimiter='|',
                     encoding='utf-8',
                     filling_values=None)
</code></pre>
<p>Result:</p>
<pre><code>array([(False, 5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGIN', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', 'qwewqeq', '', 'weqeqewqewe'),
(False, 5818222, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGOUT', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', '', 'qweqe', 'weqeqewqewe'),
(False, 5818222, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGOUT', 'SESSION', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', 'qweqe', '', 'weqeqewqewe'),
(False, 5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGOUT', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', '', '', 'weqeqewqewe'),
(False, 5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGIN', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', '', '', 'weqeqewqewe'),
(False, 5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGIN', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', 'qweqwe', 'wqewqe', 'weqeqewqewe')],
dtype=[('ID', '?'), ('TIMESTAMP', '<i4'), ('EVENT_DATE', '<U25'), ('GROUP', '<U10'), ('EVENT', '<U6'), ('CHANNEL', '<U14'), ('WERT', '?'), ('WERTY', '<U15'), ('WERTY_1', '<U15'), ('SESSION_ID', '<U8'), ('IP', '<U22'), ('WERT_1', '<U7'), ('DATA', '<U6'), ('f0', '<U11')])
</code></pre>
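<p>The first column is <code>False</code> and the <code>dtype</code> is wrong because the first line of the txt file contains more delimiters than the other lines. Every line starts with a <code>|</code>, which creates an empty leading field, and the header also ends with one:</p>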
<pre><code>>>> line0 = "|ID|TIMESTAMP|EVENT_DATE|GROUP|EVENT|CHANNEL|WERT|WERTY|WERTY|SESSION_ID|IP|WERT|DATA|"
>>> line1 = "|5818221|2021-03-15T18:18:20+01:00|2021-03-15|LOGIN|SESSION-EXPIRE||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD|qwewqeq||weqeqewqewe"
>>> delimiter = '|'
>>> line0.count(delimiter)
14
>>> line1.count(delimiter)
13
</code></pre>
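<p>The same check can be run over a whole file rather than two hand-copied lines. This is only a minimal sketch: <code>delimiter_counts</code> is a hypothetical helper (not part of NumPy) and the three sample lines are shortened stand-ins for the real data.</p>

```python
def delimiter_counts(lines, delimiter="|"):
    """Map each delimiter count to the 1-based line numbers that have it."""
    counts = {}
    for number, line in enumerate(lines, start=1):
        counts.setdefault(line.count(delimiter), []).append(number)
    return counts


# Shortened stand-ins for the real file (assumption, for illustration only).
sample = [
    "|ID|TIMESTAMP|DATA|",      # header: leading and trailing delimiter
    "|5818221|2021-03-15|abc",  # data row: leading delimiter only
    "|5818222|2021-03-15|def",
]

# More than one key means the lines do not all have the same number of columns.
print(delimiter_counts(sample))  # {4: [1], 3: [2, 3]}

# A leading delimiter produces an empty first field when the line is split.
print("|a|b".split("|"))  # ['', 'a', 'b']
```

<p>For a real file, something like <code>delimiter_counts(open('example.txt', encoding='utf-8'))</code> should work the same way, since the helper only iterates over lines.</p>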
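<p>Solution:
One delimiter separates two fields, so the 13 fields here need only 12 delimiters per line. Removing the leading <code>|</code> from every line and the trailing <code>|</code> from the header gives this
txt file:</p>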
<pre><code>ID|TIMESTAMP|EVENT_DATE|GROUP|EVENT|CHANNEL|WERT|WERTY|WERTY|SESSION_ID|IP|WERT|DATA
5818221|2021-03-15T18:18:20+01:00|2021-03-15|LOGIN|SESSION-EXPIRE||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD|qwewqeq||weqeqewqewe
5818222|2021-03-15T18:18:20+01:00|2021-03-15|LOGOUT|SESSION-EXPIRE||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD||qweqe|weqeqewqewe
5818222|2021-03-15T18:18:20+01:00|2021-03-15|LOGOUT|SESSION||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD|qweqe||weqeqewqewe
5818221|2021-03-15T18:18:20+01:00|2021-03-15|LOGOUT|SESSION-EXPIRE||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD|||weqeqewqewe
5818221|2021-03-15T18:18:20+01:00|2021-03-15|LOGIN|SESSION-EXPIRE||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD|||weqeqewqewe
5818221|2021-03-15T18:18:20+01:00|2021-03-15|LOGIN|SESSION-EXPIRE||qweqwewqewqewqe|qweqewqewqwqeqw|STANDARD|lAkpligg11Ds9nJGFRPdeD|qweqwe|wqewqe|weqeqewqewe
</code></pre>
<p>Code:</p>
<pre><code>data = np.genfromtxt('d2.txt',names=True,dtype=None,delimiter='|',encoding='utf-8',filling_values=None,skip_header=0)
</code></pre>
<p>Result:</p>
<pre><code>array([(5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGIN', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', 'qwewqeq', '', 'weqeqewqewe'),
(5818222, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGOUT', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', '', 'qweqe', 'weqeqewqewe'),
(5818222, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGOUT', 'SESSION', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', 'qweqe', '', 'weqeqewqewe'),
(5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGOUT', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', '', '', 'weqeqewqewe'),
(5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGIN', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', '', '', 'weqeqewqewe'),
(5818221, '2021-03-15T18:18:20+01:00', '2021-03-15', 'LOGIN', 'SESSION-EXPIRE', False, 'qweqwewqewqewqe', 'qweqewqewqwqeqw', 'STANDARD', 'lAkpligg11Ds9nJGFRPdeD', 'qweqwe', 'wqewqe', 'weqeqewqewe')],
dtype=[('ID', '<i4'), ('TIMESTAMP', '<U25'), ('EVENT_DATE', '<U10'), ('GROUP', '<U6'), ('EVENT', '<U14'), ('CHANNEL', '?'), ('WERT', '<U15'), ('WERTY', '<U15'), ('WERTY_1', '<U8'), ('SESSION_ID', '<U22'), ('IP', '<U7'), ('WERT_1', '<U6'), ('DATA', '<U11')])
</code></pre>
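<p>If editing the file by hand is not practical, the same cleanup can probably be done in memory, because <code>np.genfromtxt</code> also accepts a list of strings instead of a file name. The sketch below assumes the layout described above (one extra leading <code>|</code> on every line, one extra trailing <code>|</code> on the header only); <code>strip_outer_delimiters</code> is a hypothetical helper and the sample lines are shortened stand-ins for the real data.</p>

```python
import numpy as np


def strip_outer_delimiters(lines, delimiter="|"):
    """Drop one leading delimiter per line and one trailing one on the header."""
    cleaned = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(delimiter):
            line = line[len(delimiter):]
        cleaned.append(line)
    # Only the header is trimmed at the end: a trailing delimiter on a data
    # row could mark a genuinely empty last field.
    if cleaned and cleaned[0].endswith(delimiter):
        cleaned[0] = cleaned[0][:-len(delimiter)]
    return cleaned


# Shortened stand-ins for the real file (assumption, for illustration only).
raw_lines = [
    "|ID|EVENT|CHANNEL|DATA|\n",
    "|5818221|LOGIN||abc\n",
    "|5818222|LOGOUT||def\n",
]

cleaned = strip_outer_delimiters(raw_lines)
data = np.genfromtxt(cleaned, names=True, dtype=None,
                     delimiter="|", encoding="utf-8")

print(data.dtype.names)  # ('ID', 'EVENT', 'CHANNEL', 'DATA')
print(data["ID"])
```

<p>With a real file, <code>raw_lines</code> could be replaced by the open file object, so the original file stays untouched on disk.</p>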