<p>OK, given that no characters are replaced or modified (as the OP stated), I came up with the following:</p>
<pre class="lang-py prettyprint-override"><code># Strip newlines from every piece of first_input; only the lengths matter.
first_input_no_newline = [(tag, text.replace('\n', '')) for tag, text in first_input]
expected_output = []
for item in first_input_no_newline:
    next_index = len(item[1])
    second_input_copy = second_input
    offset = 0
    while True:
        # Count the newlines inside the current window of the copy.
        amount = second_input_copy[:next_index].count("\n")
        if not amount:
            next_index += offset
            break
        offset += amount
        # Remove the counted newlines so the window covers more real text.
        second_input_copy = second_input_copy.replace('\n', '', amount)
    expected_output.append((item[0], second_input[:next_index]))
    # Consume the matched part of second_input before the next item.
    second_input = second_input[next_index:]
print(expected_output)
</code></pre>
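<p>Explanation: you don't have to keep track of newlines or anything like that. The newlines in <code>first_input</code> don't matter either, because <code>second_input</code> already contains all of them (plus some extra ones).</p>
<p>So, just take the length of each item in <code>first_input_no_newline</code>. If the matching part of <code>second_input</code> contains no newlines, that is also the length of the substring to cut from <code>second_input</code>. If it does contain newlines, keep counting them and removing them from a copy of <code>second_input</code> until none are left in the window, then add the total count as an offset to the index used on the original <code>second_input</code>.</p>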
<p>Example input (the OP's original input, fixed by adding the whitespace that was missing between some phrases):</p>
<pre class="lang-py prettyprint-override"><code>first_input = [
    (0, "Lorem ipsum dolor sit amet, consectetur"),
    (1, " adipiscing elit"),
    (0, ". In pellentesque\npharetra ex, at varius sem suscipit ac. "),
    (-1, "Suspendisse luctus\ncondimentum velit a laoreet. "),
    (0, "Donec dolor urna, tempus sed nulla vitae, dignissim varius neque.")
]
second_input = "Lorem ipsum dolor sit amet, \nconsectetur adipiscing elit. In pellentesque\npharetra ex, at varius sem \nsuscipit ac. Suspendisse luctus\ncondimentum velit a laoreet. Donec dolor urna, tempus sed \nnulla vitae, dignissim varius neque."
</code></pre>
<p>Output:</p>
<pre class="lang-py prettyprint-override"><code>[
    (0, 'Lorem ipsum dolor sit amet, \nconsectetur'),
    (1, ' adipiscing elit'),
    (0, '. In pellentesque\npharetra ex, at varius sem \nsuscipit ac. '),
    (-1, 'Suspendisse luctus\ncondimentum velit a laoreet. '),
    (0, 'Donec dolor urna, tempus sed \nnulla vitae, dignissim varius neque.')
]
</code></pre>
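<p>If you'd rather not reassign <code>second_input</code> in place, the same steps can be wrapped in a function. This is only a minimal sketch: the name <code>realign</code> and the tiny sample data are made up for illustration, and it assumes both inputs contain the same characters in the same order, apart from newlines.</p>

```python
def realign(first_input, second_input):
    """Re-split second_input so each piece lines up with an item of first_input.

    Hypothetical helper (not from the question): assumes the two inputs
    differ only in where newlines appear.
    """
    result = []
    remaining = second_input
    for tag, text in first_input:
        # Length of the piece once its own newlines are ignored.
        next_index = len(text.replace("\n", ""))
        working_copy = remaining
        offset = 0
        while True:
            # Newlines inside the current window of the working copy.
            amount = working_copy[:next_index].count("\n")
            if not amount:
                next_index += offset
                break
            offset += amount
            # Drop the newlines just counted; the window now covers more text.
            working_copy = working_copy.replace("\n", "", amount)
        result.append((tag, remaining[:next_index]))
        remaining = remaining[next_index:]
    return result


# Made-up sample data, much shorter than the example above.
sample_first = [(0, "ab cd"), (1, " ef\ngh")]
sample_second = "ab \ncd ef\ngh"
pieces = realign(sample_first, sample_second)
print(pieces)  # [(0, 'ab \ncd'), (1, ' ef\ngh')]
```

<p>A quick sanity check is that joining the returned pieces gives back <code>second_input</code> unchanged; if it doesn't, the inputs probably differ in more than newlines.</p>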