<p>The reason is explained (rather obscurely) in the documentation for the file object's <code>next()</code> method:</p>
<blockquote>
<p>When a file is used as an iterator, typically in a for loop (for example,
for line in f: print line), the next() method is called repeatedly.
This method returns the next input line, or raises StopIteration when
EOF is hit. In order to make a for loop the most efficient way of looping
over the lines of a file (a very common operation), the next() method
uses a hidden read-ahead buffer. As a consequence of using a read-ahead
buffer, combining next() with other file methods (like readline()) does
not work right. However, using seek() to reposition the file to an
absolute position will flush the read-ahead buffer.</p>
</blockquote>
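<p>The value returned by <code>tell()</code> reflects how far this hidden read-ahead buffer has advanced, which is typically up to a few thousand bytes beyond the characters your program has actually retrieved.</p>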
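<p>There is no portable way around this. If you need to mix <code>tell()</code> with reading lines, use the file's <code>readline()</code> method instead. The tradeoff is that, in return for getting usable <code>tell()</code> results, iterating over a large file with <code>readline()</code> is typically much slower than using <code>for line in file_object:</code>.</p>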
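<p>If you would rather keep the <code>for</code> syntax, one possible sketch is the two-argument form of <code>iter()</code>, which calls <code>readline()</code> until it returns the sentinel and so never goes through the file's <code>next()</code> method. The helper name <code>find_line_starts</code>, the sample data, and the choice of binary mode are assumptions made for illustration; they are not from the question.</p>
<pre><code>import os
import tempfile


def find_line_starts(path, needle):
    """Return the byte offset of the start of each line containing needle."""
    starts = []
    with open(path, "rb") as fh:
        pos = fh.tell()  # offset of the line about to be read
        # iter(callable, sentinel) keeps calling fh.readline() until it
        # returns b"" (EOF), so iteration via next() is never involved.
        for line in iter(fh.readline, b""):
            if needle in line:
                starts.append(pos)
            pos = fh.tell()  # offset of the following line
    return starts


# Small demonstration on a temporary file (hypothetical contents).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"alpha\nbeta\ngamma beta\n")

offsets = find_line_starts(path, b"beta")
os.remove(path)
print(offsets)  # [6, 11]
</code></pre>
<p>This still pays the per-call cost of <code>readline()</code>; it only changes how the loop is written.</p>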
<h2>The code</h2>
<p>Concretely, change the loop to this:</p>
<pre><code>line = self.fh.readline()
while line:
    if p.search(line):
        # tell() now reports the offset just past the matching line
        self.porSnipStartFPtr = self.fh.tell()
        sys.stdout.write("found regPorSnip")
    line = self.fh.readline()
</code></pre>
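<p>I'm not sure that is what you really want, though: <code>tell()</code> is capturing the position where the <em>next</em> line starts. If you want the position of the <em>start</em> of the matching line, you need to change the logic, like so:</p>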
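<pre><code>pos = self.fh.tell()
line = self.fh.readline()
while line:
    if p.search(line):
        self.porSnipStartFPtr = pos
        sys.stdout.write("found regPorSnip")
    pos = self.fh.tell()
    line = self.fh.readline()
</code></pre>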
<p>Or do it with a "loop and a half":</p>
<pre><code>while True:
    pos = self.fh.tell()
    line = self.fh.readline()
    if not line:
        break
    if p.search(line):
        self.porSnipStartFPtr = pos
        sys.stdout.write("found regPorSnip")
</code></pre>
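<p>To check that the saved offset really points at the start of the matching line, you can <code>seek()</code> back to it and read the line again. The following is a minimal standalone sketch of that round trip. The function name <code>find_snippet_start</code>, the pattern, and the sample file contents are made up for illustration, and the file is opened in binary mode so that the offsets are plain byte counts.</p>
<pre><code>import os
import re
import tempfile

# Hypothetical pattern standing in for the p used in the loops above.
p = re.compile(br"regPorSnip")


def find_snippet_start(path):
    """Return the offset of the first line matching p, or None if none does."""
    with open(path, "rb") as fh:
        while True:
            pos = fh.tell()        # offset before reading the line
            line = fh.readline()
            if not line:           # empty result means EOF
                return None
            if p.search(line):
                return pos


# Build a small temporary file to search (hypothetical contents).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"header\nregPorSnip begins here\ntrailer\n")

start = find_snippet_start(path)

# Seek back to the saved offset and re-read the line found there.
with open(path, "rb") as fh:
    fh.seek(start)
    reread = fh.readline()
os.remove(path)
</code></pre>
<p>Here <code>start</code> is the byte offset just past <code>header\n</code>, and <code>reread</code> is the matching line itself.</p>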