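<p>It may be better to read the file line by line. That way you won't run into memory problems if the file is very large, and you can run the 4-digit check on each line itself without any awkward splitting.</p>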
<pre><code>doc = 0
towrite = ""
with open("somefile.txt", "r") as f:
for i, line in enumerate(f):
if len(line.strip()) == 4 and line.strip().isdigit():
if i > 0: # write txt from prior parse
wfile = open("{}.txt".format(doc), "w")
wfile.write(towrite)
wfile.close()
doc = line.strip()
towrite = "" # reset
else:
towrite += line
wfile = open("{}.txt".format(doc), "w")
wfile.write(towrite)
wfile.close()
</code></pre>
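<p>If a single section can itself be very large, one possible variant is to write each line to the current output file as soon as it is read instead of collecting it in a string first. The following is only a sketch of that idea: the function name <code>split_by_header</code> and the parameters <code>out_dir</code> and <code>default_name</code> are made up for illustration, and unlike the code above it creates no file at all for an empty input.</p>

```python
import os


def split_by_header(path, out_dir=".", default_name="0"):
    """Split `path` into <header>.txt files, one per 4-digit header line.

    Each line is written to the current output file as soon as it is read,
    so memory use does not grow with the size of a section.
    Returns the output file names in the order they were created.
    """
    written = []
    out = None
    try:
        with open(path, "r") as f:
            for line in f:
                stripped = line.strip()
                if len(stripped) == 4 and stripped.isdigit():
                    # new header: close the previous file and start the next one
                    if out is not None:
                        out.close()
                    name = stripped + ".txt"
                    out = open(os.path.join(out_dir, name), "w")
                    written.append(name)
                else:
                    if out is None:
                        # text before the first header goes to the fallback file
                        name = default_name + ".txt"
                        out = open(os.path.join(out_dir, name), "w")
                        written.append(name)
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

<p>Under these assumptions, calling <code>split_by_header("somefile.txt")</code> on the test file shown below should produce the same three files. As in the code above, a header that appears twice overwrites the earlier file, because the files are opened in <code>"w"</code> mode.</p>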
<p>Test file:</p>
<pre><code>1234
43267583291483 1234 3213213
57489367483929 32133248 3728913
3267
32163721837362 4723 3291832
42189323471911 321113 3211111132
326189183828327 3218484828283 828238281
21838282387 3726173 6278
1111
1236274818 327813678
32167382167894829013 321
</code></pre>
<p>Result:</p>
<p><strong>1234.txt</strong></p>
<pre><code>43267583291483 1234 3213213
57489367483929 32133248 3728913
</code></pre>
<p><strong>3267.txt</strong></p>
<pre><code>32163721837362 4723 3291832
42189323471911 321113 3211111132
326189183828327 3218484828283 828238281
21838282387 3726173 6278
</code></pre>
<p><strong>1111.txt</strong></p>
<pre><code>1236274818 327813678
32167382167894829013 321
</code></pre>