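<p>You can't seek to it directly without either knowing the size of the junk data or scanning through it. But it's not hard to wrap the file in <a href="https://docs.python.org/3/library/itertools.html#itertools.dropwhile" rel="nofollow"><code>itertools.dropwhile</code></a> to discard lines until you see the "good" data, after which it iterates over all the remaining lines:</p>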
<pre><code>import itertools

# Or def a regular function that returns True until you see the line
# delimiting the beginning of the "good" data
not_good = '# The stuff I care about\n'.__ne__
with open(filename) as f:
    for line in itertools.dropwhile(not_good, f):
        # You'll iterate the lines at and after the good line;
        # replace the print with your own processing
        print(line, end='')
</code></pre>
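<p>The comment above mentions using a regular function instead of the bound <code>__ne__</code> method. A minimal sketch of that, assuming a hypothetical marker prefix and using <code>io.StringIO</code> as a stand-in for a real file, might look like this:</p>

```python
import io
import itertools

# Hypothetical marker; substitute whatever begins your "good" data.
MARKER_PREFIX = '# The stuff I care about'

def not_good(line):
    """Return True while we are still in the junk before the marker line."""
    return not line.startswith(MARKER_PREFIX)

# io.StringIO stands in for a file opened with open(filename).
sample = io.StringIO(
    'junk 1\n'
    'junk 2\n'
    '# The stuff I care about\n'
    'data 1\n'
    'data 2\n'
)

# dropwhile discards lines while not_good is True, then yields the rest.
kept = list(itertools.dropwhile(not_good, sample))
print(kept)
```

<p>A named function is a little longer than the <code>__ne__</code> trick, but it makes looser matching possible (a prefix test here, so trailing whitespace on the marker line would not matter).</p>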
<p>If you actually need the file object itself positioned appropriately, not just the lines, this variant should work:</p>
^{pr2}$
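<p>One way such a variant could look, as a hedged sketch rather than the definitive code: read up to and including the first good line, then seek backwards by its length. This assumes Python 3 and binary mode (text-mode files there reject nonzero relative seeks), and the marker bytes and temporary file are made up for the demonstration:</p>

```python
import io
import itertools
import os
import tempfile

# Hypothetical marker line, as bytes because the file is read in binary mode.
MARKER = b'# The stuff I care about\n'
not_good = MARKER.__ne__

# Build a throwaway file so the sketch runs end to end.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'junk 1\njunk 2\n' + MARKER + b'data 1\n')
    filename = tmp.name

try:
    with open(filename, 'rb') as f:
        # Consume lines up to and including the first good one.
        good_start = next(itertools.dropwhile(not_good, f))
        # Step back over the good line so the next read returns it again.
        f.seek(-len(good_start), io.SEEK_CUR)
        offset = f.tell()   # byte offset where the good data begins
        rest = f.read()     # everything from the good line onward
finally:
    os.remove(filename)
```

<p>If no line matches, <code>next</code> raises <code>StopIteration</code>; passing a default such as <code>next(..., None)</code> and checking for it is one way to handle that case.</p>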
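<p>You can tweak that to get the actual line number if you really need it (rather than just the offset). It's a little less readable, though, so explicit iteration via <code>enumerate</code> may make more sense if you need to do it (left as an exercise). The way to make Python do the work for you is:</p>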
<pre><code>import io
import itertools
from operator import itemgetter
try:
    from future_builtins import map, zip  # Py2 only: lazy versions
except ImportError:
    pass  # Py3: the built-in map and zip are already lazy

# Bytes predicate, because the file is opened in binary mode below
not_good = b'# The stuff I care about\n'.__ne__
# Binary mode: Py3 text files reject nonzero relative seeks
with open(filename, 'rb') as f:
    linectr = itertools.count()
    # Get first good line
    # Pair each line with a 0-up number to advance the count generator, but
    # strip it immediately so not_good only processes lines, not line nums
    good_start = next(itertools.dropwhile(not_good, map(itemgetter(0), zip(f, linectr))))
    good_lineno = next(linectr)  # Keeps the 1-up line number by advancing once
    # Seek back to undo the read of the first good line:
    f.seek(-len(good_start), io.SEEK_CUR)
    # f is now positioned at the beginning of the line that begins the good data
</code></pre>
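<p>For comparison, the explicit <code>enumerate</code> approach mentioned above could be sketched roughly as follows. The helper name <code>find_good_start</code> and the marker bytes are hypothetical, and the sketch assumes a seekable file opened in binary mode:</p>

```python
import io

# Hypothetical marker line, as bytes for a binary-mode file.
MARKER = b'# The stuff I care about\n'

def find_good_start(f, marker=MARKER):
    """Return (lineno, offset) of the first line equal to marker.

    lineno is 1-up. On success, f is left positioned at the start of
    that line. Raises ValueError if the marker never appears.
    """
    offset = f.tell()  # start counting bytes from the current position
    for lineno, line in enumerate(f, 1):
        if line == marker:
            f.seek(offset)  # absolute seek back to the start of this line
            return lineno, offset
        offset += len(line)
    raise ValueError('marker line not found')

# io.BytesIO stands in for open(filename, 'rb').
sample = io.BytesIO(b'junk 1\njunk 2\n' + MARKER + b'data 1\n')
lineno, offset = find_good_start(sample)
first_line = sample.readline()
```

<p>Tracking the byte offset by hand means only an absolute seek is needed, and the line number falls out of <code>enumerate</code> without the <code>count</code>/<code>zip</code> pairing.</p>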