<p>If I understand correctly, you want to split the text into paragraphs at the lines with the smallest indentation.</p>
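<p>My approach is as follows. I build a <a href="https://docs.python.org/3/library/collections.html#collections.defaultdict" rel="nofollow noreferrer">defaultdict</a> whose keys are the number of leading whitespace characters that make up an indentation, and whose values are lists of the indices of all lines with that indentation:</p>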
<pre><code>from collections import defaultdict

text = '''TextTextTextTextTextTextTextTextText1
        TextTextTextTextTextTextTextTextText1
    TextTextTextTextTextTextTextTextText2
        TextTextTextTextTextTextTextTextText2
        TextTextTextTextTextTextTextTextText2
        TextTextTextTextTextTextTextTextText2
        TextTextTextTextTextTextTextTextText2
    TextTextTextTextTextTextTextTextText3
        TextTextTextTextTextTextTextTextText3
        TextTextTextTextTextTextTextTextText3
    TextTextTextTextTextTextTextTextText4
        TextTextTextTextTextTextTextTextText4
        TextTextTextTextTextTextTextTextText4'''

def count_indentation(line):
    # Number of leading whitespace characters on the line.
    return len(line) - len(line.lstrip())

lines = text.splitlines(keepends=False)
indent_dict = defaultdict(list)
for idx, line in enumerate(lines):
    indent = count_indentation(line)
    # Skip unindented lines so they never count as the smallest indentation.
    if indent > 0:
        indent_dict[indent].append(idx)
</code></pre>
<p><code>indent_dict</code> now looks like this:</p>
<pre><code>defaultdict(list, {8: [1, 3, 4, 5, 6, 8, 9, 11, 12], 4: [2, 7, 10]})
</code></pre>
<p>Next, we use the smallest key to look up the indices of the relevant lines:</p>
<pre><code>smallest_indent = min(indent_dict)
line_idexes_smallest_indents = indent_dict[smallest_indent]
</code></pre>
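<p><code>line_idexes_smallest_indents</code> evaluates to <code>[2, 7, 10]</code>. The indices are zero-based, which is why each of mine is one less than in your result. Now we need to partition the original lines at these indices:</p>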
<pre><code>def partition(lines, indices):
    # Pair each start index with the next one; None lets the last slice run to the end.
    return [''.join(lines[i:j]) for i, j in zip([0]+indices, indices+[None])]

partition(lines, line_idexes_smallest_indents)
</code></pre>
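<p>To make the <code>zip</code> trick in <code>partition</code> easier to follow, here is a small sketch of how the slice boundaries are paired. The names <code>starts</code>, <code>stops</code>, <code>bounds</code> and <code>demo</code> are only illustrative and not part of the code above; <code>demo</code> is a stand-in list with 13 items, the same number of lines as the sample text.</p>

```python
# Illustrative only: how zip([0] + indices, indices + [None]) builds slice bounds.
indices = [2, 7, 10]

starts = [0] + indices     # every chunk starts at 0 or at one of the indices
stops = indices + [None]   # and stops at the next index; None means "to the end"
bounds = list(zip(starts, stops))
print(bounds)

# A stand-in for the 13 lines of the sample text.
demo = list("abcdefghijklm")
chunks = [demo[i:j] for i, j in bounds]
print([len(chunk) for chunk in chunks])
```

<p>Each pair is used directly as <code>lines[i:j]</code>, so no line is dropped or duplicated between neighbouring chunks.</p>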
<p>Result:</p>
<pre><code>['TextTextTextTextTextTextTextTextText1        TextTextTextTextTextTextTextTextText1',
 '    TextTextTextTextTextTextTextTextText2        TextTextTextTextTextTextTextTextText2        TextTextTextTextTextTextTextTextText2        TextTextTextTextTextTextTextTextText2        TextTextTextTextTextTextTextTextText2',
 '    TextTextTextTextTextTextTextTextText3        TextTextTextTextTextTextTextTextText3        TextTextTextTextTextTextTextTextText3',
 '    TextTextTextTextTextTextTextTextText4        TextTextTextTextTextTextTextTextText4        TextTextTextTextTextTextTextTextText4']
</code></pre>
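<p>If it helps, the steps can be combined into a single function. This is only a sketch under a few assumptions of mine: the name <code>split_paragraphs</code> is hypothetical, each line is stripped and the lines are joined with a single space (unlike <code>partition</code>, which keeps the original whitespace), and text without any indented line is returned as one paragraph.</p>

```python
def split_paragraphs(text):
    """Split text into paragraphs that start at the least-indented indented lines.

    Hypothetical helper: strips each line and joins lines with one space.
    """
    lines = text.splitlines()
    indents = [len(line) - len(line.lstrip()) for line in lines]
    positive = [n for n in indents if n > 0]
    if not positive:
        # Assumption: with no indented line, treat everything as one paragraph.
        return [' '.join(line.strip() for line in lines)] if lines else []

    smallest = min(positive)
    starts = [idx for idx, n in enumerate(indents) if n == smallest]
    bounds = zip([0] + starts, starts + [None])
    # Skip an empty first slice, which occurs when the very first line is a start.
    return [' '.join(line.strip() for line in lines[i:j])
            for i, j in bounds if lines[i:j]]


sample = "A1\n        A2\n    B1\n        B2\n    C1"
paragraphs = split_paragraphs(sample)
print(paragraphs)
```

<p>Note that a line consisting only of whitespace still counts as indented here, so blank-looking lines may need to be filtered out first depending on your input.</p>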