<p>The best approach is to use the <code>re</code> module.</p>
<pre><code>import re

s = '''asfdsafadfa "Tabvxc "avcx"sdasaf" sadasfdf. sdsadsaf '0000000000000000000000000000000'." is something'''
pat = re.compile(
    r'''
    ^       # start of the string
    (.*?)   # first part; *? makes the match non-greedy
    (".*")  # part between the outermost " characters (quotes included)
    (.*?)   # last part
    $       # end of the string
    ''', re.DOTALL | re.VERBOSE)
</code></pre>
<blockquote>
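<pre><code>pat.match(s).groups()
# ('asfdsafadfa ', '"Tabvxc "avcx"sdasaf" sadasfdf. sdsadsaf \'0000000000000000000000000000000\'."', ' is something')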
</code></pre>
</blockquote>
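<p>To see why the middle group is deliberately greedy, here is a minimal sketch that compares <code>(".*")</code> with a non-greedy <code>(".*?")</code>. The sample string and the names <code>greedy</code> and <code>lazy</code> are made up for illustration and are not part of the original answer.</p>
<pre><code>import re

# Hypothetical sample with nested double quotes
s = 'before "outer "inner" text" after'

# Greedy middle group: runs from the first double quote to the last one.
greedy = re.match(r'^(.*?)(".*")(.*?)$', s, re.DOTALL)
print(greedy.groups())
# ('before ', '"outer "inner" text"', ' after')

# Non-greedy middle group, for comparison: it stops at the second double
# quote, so the last group has to absorb the rest of the string.
lazy = re.match(r'^(.*?)(".*?")(.*?)$', s, re.DOTALL)
print(lazy.groups())
# ('before ', '"outer "', 'inner" text" after')
</code></pre>
<p>Only the greedy version keeps everything between the outermost quotes together, which is what the pattern in this answer relies on.</p>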
<p>Putting it all together, this becomes:</p>
<pre><code>import re
from io import StringIO

test_str = '''asfdsafadfa "Tabvxc "avcx"sdasaf" sadasfdf. sdsadsaf '0000000000000000000000000000000'." is something
asfdsafadfa "Tabvxc "avcx"sdasaf" sadasfdf. sdsadsaf '0000000000000000000000000000000'."
asfdsafadfa Tabvxc avcxsdasaf sadasfdf. sdsadsaf '0000000000000000000000000000000'.
'''

def split_lines(filehandle):
    # Split each line around its outermost pair of double quotes
    pat = re.compile(r'''^(.*?)(".*")(.*?)$''', re.DOTALL)
    for line in filehandle:
        match = pat.match(line)
        if match:
            yield match.groups()  # (before, quoted part, after)
        else:
            yield line  # no quoted part: pass the line through unchanged

with StringIO(test_str) as openfile:
    for line in split_lines(openfile):
        print(line)
</code></pre>
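<p>The generator first iterates over the open file handle line by line, then tries to split each line. If the match succeeds, it yields a tuple containing the three parts; otherwise it yields the original line unchanged.</p>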
<p>In a real program, you can replace <code>StringIO(test_str)</code> with <code>open(filename, 'r')</code>.</p>
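<p>As a rough sketch of that swap, the snippet below runs the same generator over a real file. The temporary file, its contents, and the <code>results</code> list are assumptions added only to make the example self-contained; in practice <code>filename</code> would point at your own data.</p>
<pre><code>import os
import re
import tempfile

def split_lines(filehandle):
    # Same splitting logic as above, condensed
    pat = re.compile(r'^(.*?)(".*")(.*?)$', re.DOTALL)
    for line in filehandle:
        match = pat.match(line)
        yield match.groups() if match else line

# Hypothetical sample file, created only so the sketch is runnable
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('left "quoted "nested" part" right\n')
    tmp.write('no quotes here\n')
    filename = tmp.name

try:
    # open() replaces StringIO(test_str); the generator itself is unchanged
    with open(filename, 'r', encoding='utf-8') as openfile:
        results = list(split_lines(openfile))
finally:
    os.remove(filename)  # clean up the temporary file

for item in results:
    print(item)
</code></pre>
<p>Because the generator only iterates over its argument, any object that yields lines (a file, a <code>StringIO</code>, or a plain list of strings) should work the same way.</p>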
<blockquote>
<pre><code>('asfdsafadfa ', '"Tabvxc "avcx"sdasaf" sadasfdf. sdsadsaf \'0000000000000000000000000000000\'."', ' is something')
('asfdsafadfa ', '"Tabvxc "avcx"sdasaf" sadasfdf. sdsadsaf \'0000000000000000000000000000000\'."', '')
asfdsafadfa Tabvxc avcxsdasaf sadasfdf. sdsadsaf '0000000000000000000000000000000'.
</code></pre>
</blockquote>