<p>CPython (the reference interpreter) has an optimization for in-place string concatenation (when the string being appended to has no other references). It can't apply this optimization reliably when doing <code>+</code>, only <code>+=</code> (<code>+</code> involves two live references, the assignment target and the operand, and the former isn't involved in the <code>+</code> operation itself, so it's harder to optimize).</p>
<p>You shouldn't rely on this though, per <a href="https://www.python.org/dev/peps/pep-0008/#programming-recommendations" rel="noreferrer">PEP 8</a>:</p>
<blockquote>
<p>Code should be written in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Cython, Psyco, and such).</p>
<p>For example, do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b . This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.</p>
</blockquote>
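<p>To make PEP 8's point concrete, here is a minimal sketch (the data and sizes are invented) that builds the same string both ways; on CPython the <code>+=</code> loop happens to stay fast, but only <code>''.join</code> is guaranteed linear time on every implementation:</p>

```python
# Minimal sketch: build the same string two ways (data/sizes are made up)
parts = [str(i) for i in range(1000)]

def concat_in_place(chunks):
    out = ""
    for chunk in chunks:
        out += chunk  # fast on CPython via the refcount trick, O(n**2) elsewhere
    return out

def concat_join(chunks):
    return "".join(chunks)  # linear time on every Python implementation

assert concat_in_place(parts) == concat_join(parts)
```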
<p><strong>Update based on question edits</strong>: Yes, you broke the optimization. You're concatenating many strings, not just one, and Python evaluates left-to-right, so it must perform the leftmost concatenation first. Thus:</p>
<pre><code>new_file_content += line.strip() + "~" + row_region + "\n"
</code></pre>
<p>is completely different from:</p>
<pre><code>new_file_content = new_file_content + line.strip() + "~" + row_region + "\n"
</code></pre>
<p>because the former joins all the <em>new</em> pieces together, then appends them to the accumulator string all at once, while the latter must evaluate each addition from left to right with temporaries that don't involve <code>new_file_content</code> itself. Adding parens for clarity, it's as if you'd done:</p>
<pre><code>new_file_content = (((new_file_content + line.strip()) + "~") + row_region) + "\n"
</code></pre>
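<p>You can actually watch that left-to-right chain of temporaries in the compiled bytecode. This sketch counts the add instructions each form compiles to (opcode names have changed across CPython versions, so it matches the historical and current spellings):</p>

```python
import dis

def count_adds(source):
    """Count the add instructions compiled for a statement."""
    add_ops = {"BINARY_ADD", "INPLACE_ADD", "BINARY_OP"}
    return sum(1 for ins in dis.get_instructions(source)
               if ins.opname in add_ops)

# The chained form performs every + before the single store at the end,
# so each intermediate result is a fresh temporary string...
chained = count_adds('a = a + b + c + d')   # three separate adds
# ...while += is a single augmented operation on the accumulator itself
augmented = count_adds('a += b')            # one (in-place) add
print(chained, augmented)
```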
<p>Because it doesn't actually know the types until it reaches them, it can't assume all of those are strings, so the optimization doesn't kick in.</p>
<p>If you changed the second bit of code to:</p>
<pre><code>new_file_content = new_file_content + (line.strip() + "~" + row_region + "\n")
</code></pre>
<p>or, slightly slower but still many times faster than your slow code because it keeps the CPython optimization:</p>
<pre><code>new_file_content = new_file_content + line.strip()
new_file_content = new_file_content + "~"
new_file_content = new_file_content + row_region
new_file_content = new_file_content + "\n"
</code></pre>
<p>then the accumulation would be obvious to CPython and you'd fix your performance problem. But frankly, you should just be using <code>+=</code> any time you perform a logical append operation like this; <code>+=</code> exists for a reason, and it provides useful information to both maintainers and the interpreter. Beyond that, it's good practice as far as <a href="https://en.wikipedia.org/wiki/DRY" rel="noreferrer">DRY</a> goes; why name the variable twice when you don't need to?</p>
<p>Of course, per the PEP 8 guidelines, even using <code>+=</code> here is bad form. In most languages with immutable strings (including most non-CPython Python interpreters), repeated string concatenation is a form of <a href="https://en.wikipedia.org/wiki/Joel_Spolsky#Schlemiel_the_Painter.27s_algorithm" rel="noreferrer">Schlemiel the Painter's Algorithm</a>, which causes serious performance problems. The correct solution is to build a <code>list</code> of strings, then <code>join</code> them all at once, e.g.:</p>
<pre><code>new_file_content = []
for i, line in enumerate(content):
    if i == 0:
        # In local tests, += with an anonymous tuple runs faster than
        # concatenating short strings and then calling append.
        # Python caches small tuples, so creating them is cheap,
        # and using syntax over function calls is also optimized more heavily
        new_file_content += (line.strip(), "~region\n")
    else:
        country = line.split("~")[13]
        try:
            row_region = regions[country]
        except KeyError:
            row_region = "Undetermined"
        new_file_content += (line.strip(), "~", row_region, "\n")

# Finished accumulating, make the final string all at once
new_file_content = "".join(new_file_content)
</code></pre>
<p>This is usually faster even when the CPython string-concat optimization is available, and it will be reliably fast on non-CPython interpreters as well, because it uses a mutable <code>list</code> to accumulate results efficiently, then lets <code>''.join</code> precompute the total length of the string, allocate the final string all at once (instead of incrementally resizing along the way), and fill it in exactly once.</p>
<p>Side-note: For your specific case, you shouldn't be accumulating or concatenating at all. You've got an input file and an output file, and can process line by line. Every time you would append to or accumulate the file contents, just write them out instead (I've cleaned up the code a bit for PEP 8 compliance and a few other minor style improvements while I was at it):</p>
<pre><code>import time

start_time = time.monotonic()  # You're on Py3; monotonic is more reliable for timing

# Use with statements for both the input and output files
with open(fname) as f, open("CMDB_STAGE060.csv", "w") as new_file:
    # Iterate the input file directly; readlines just means higher peak memory use
    # Maintaining your own counter is silly when enumerate exists
    for i, line in enumerate(f):
        if not i:
            # Write to the file directly, don't store
            new_file.write(line.strip() + "~region\n")
        else:
            country = line.split("~")[13]
            # .get exists to avoid try/except when you have a simple, constant default
            row_region = regions.get(country, "Undetermined")
            # Write to the file directly, don't store
            new_file.write(line.strip() + "~" + row_region + "\n")

end_time = time.monotonic()
# print will stringify its arguments and separate them with spaces for you
print("total time:", end_time - start_time)
</code></pre>
<h2>Deep dive into the implementation details</h2>
<p>For those curious about the implementation details, the CPython string concat optimization is implemented in the byte code interpreter, not on the <code>str</code> type itself (technically, <code>PyUnicode_Append</code> does the mutation optimization, but it requires help from the interpreter to fix up reference counts so it knows it can use the optimization safely; without interpreter help, only C extension modules would ever benefit from it).</p>
<p>When the interpreter <a href="https://hg.python.org/cpython/file/3.5/Python/ceval.c#l1763" rel="noreferrer">detects that both operands are the Python level <code>str</code> type</a> (at the C layer, in Python 3 it's still referred to as <code>PyUnicode</code>, a legacy of the 2.x days that wasn't worth changing), it calls <a href="https://hg.python.org/cpython/file/3.5/Python/ceval.c#l5278" rel="noreferrer">a special <code>unicode_concatenate</code> function</a>, which checks whether the next instruction is one of three basic <code>STORE_*</code> instructions. If it is, and the target is the same as the left operand, it clears the target reference so <code>PyUnicode_Append</code> will see only a single reference to the operand, allowing it to invoke the optimized code path for a <code>str</code> with a single reference.</p>
<p>This means that not only can you break the optimization by writing</p>
<pre><code>a = a + b + c
</code></pre>
<p>you can also break it any time the variable in question is not a top-level (global, nested, or local) name. If you're operating on an object attribute, a <code>list</code> index, a <code>dict</code> value, etc., even <code>+=</code> won't help you; it won't see a "simple <code>STORE</code>", so it doesn't clear the target reference, and all of these get the ultra-slow, not-in-place behavior:</p>
<pre><code>foo.x += mystr
foo[0] += mystr
foo['x'] += mystr
</code></pre>
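<p>A sketch of those "hidden" accumulators (the <code>Holder</code> class is invented for illustration): both loops produce identical results, but on CPython only the bare-name <code>+=</code> benefits from the in-place optimization, so the attribute version degrades to quadratic copying at scale:</p>

```python
class Holder:
    """Made-up container whose attribute acts as an accumulator."""
    def __init__(self):
        self.x = ""

parts = ["chunk"] * 100  # fabricated sample data

# Bare local name: the interpreter sees a simple STORE and can mutate in place
acc = ""
for p in parts:
    acc += p

# Attribute target: += compiles to LOAD_ATTR/.../STORE_ATTR, no simple STORE,
# so every iteration builds a brand-new string (the slow path)
holder = Holder()
for p in parts:
    holder.x += p

assert acc == holder.x  # same result either way; only the speed differs
```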
<p>It's also specific to the <code>str</code> type; in Python 2, the optimization doesn't help with <code>unicode</code> objects, and in Python 3, it doesn't help with <code>bytes</code> objects, and in neither version does it optimize for subclasses of <code>str</code>; those always take the slow path.</p>
<p>Basically, the optimization is as nice as it can possibly be for people new to Python in the simplest common cases, but it isn't going to go to serious trouble for even moderately more complex cases. This just reinforces the PEP 8 recommendation: depending on implementation details of your interpreter is a bad idea when you could run faster on <em>every</em> interpreter, for any store target, by doing the right thing and using <code>str.join</code>.</p>