Why is updating a Python dict value so slow?



I have a Python program that reads lines from a file and collects them in a dict. Simplified, it looks like this:

data = {'file_name':''}
with open('file_name') as in_fd:
    for line in in_fd:
        data['file_name'] += line

I found it took hours to finish.

Then I changed the program to:

data = {'file_name': []}
with open('file_name') as in_fd:
    for line in in_fd:
        data['file_name'].append(line)
data['file_name'] = ''.join(data['file_name'])

It finished in a few seconds.

I thought it was += that made the program slow, but it seems not. See the results of the tests below.

I know we can use list append and join to improve the performance of string concatenation, but I never expected such a huge performance gap between append-and-join and add-and-assign.

So I did some more tests and finally found that it is the dict update operation that makes the program extremely slow. Here is the script:

import time
LOOPS = 10000
WORD = 'ABC' * 100

s1 = time.time()
buf1 = []                        # 1) list append, then join once
for i in xrange(LOOPS):
    buf1.append(WORD)
ss = ''.join(buf1)

s2 = time.time()
buf2 = ''                        # 2) += on a plain local string
for i in xrange(LOOPS):
    buf2 += WORD

s3 = time.time()
buf3 = {'1': ''}                 # 3) += on a string stored in a dict
for i in xrange(LOOPS):
    buf3['1'] += WORD

s4 = time.time()
buf4 = {'1': []}                 # 4) append to a list in a dict, then join
for i in xrange(LOOPS):
    buf4['1'].append(WORD)
buf4['1'] = ''.join(buf4['1'])

s5 = time.time()
print s2-s1, s3-s2, s4-s3, s5-s4

On my laptop (mid-2013 MacBook Pro, OS X 10.9.5, CPython 2.7.10) it prints:

0.00299620628357 0.00415587425232 3.49465799332 0.00231599807739

Inspired by juanpa.arrivillaga's comment, I modified the second loop:

trivial_reference = []
buf2 = ''
for i in xrange(LOOPS):
    buf2 += WORD
    trivial_reference.append(buf2)  # add a trivial reference to avoid optimization

After this change the second loop takes 19 seconds to complete, so it does seem to be just an optimization issue, as juanpa.arrivillaga said.
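If you want to reproduce the effect without hand-rolled timing, here is a minimal sketch using the standard timeit module (Python 2 syntax to match the question; absolute numbers will vary by machine):

import timeit

setup = "WORD = 'ABC' * 100"

plain = '''
buf = ''
for _ in xrange(10000):
    buf += WORD
'''

shared = '''
refs = []
buf = ''
for _ in xrange(10000):
    buf += WORD
    refs.append(buf)   # the extra reference defeats the in-place resize
'''

# the second number should come out dramatically larger than the first
print timeit.timeit(plain, setup=setup, number=10)
print timeit.timeit(shared, setup=setup, number=10)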


1 Answer

+= performs very badly when building large strings, but there is one case in CPython where it can be efficient, described below.

For faster string concatenation, use str.join().


From the String Concatenation section of the Python Performance Tips:

Avoid this:

s = ""
for substring in list:
    s += substring

Use s = "".join(list) instead. The former is a very common and catastrophic mistake when building large strings.


Why is s += x faster than s['1'] += x or s[0] += x?

From Note 6 under Sequence Types in the Python docs:

CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.

The CPython optimization is: if a string has only one reference, it can be resized in place.

/* Note that we don't have to modify *unicode for unshared Unicode objects, since we can modify them in-place. */
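You can observe the reference counts directly with sys.getrefcount; a minimal sketch in Python 2 (getrefcount holds one temporary reference to its own argument, so the smallest value it reports is 2):

import sys

s = 'ABC' * 100
print sys.getrefcount(s)   # 2: only the local name s -> += may resize in place

d = {'1': s}               # the dict now also references the string
print sys.getrefcount(s)   # 3: shared -> d['1'] += x must copy the whole string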

The latter two are not simple in-place additions; in fact, they are not in-place additions at all.

s[0] += x

is equivalent to:

temp = s[0]    # extra reference to the string
temp += x      # refcount > 1, so the string cannot be resized in place
s[0] = temp

Example:

>>> lst = [1, 2, 3]
>>> def func():
...     lst[0] = 90
...     return 100
...
>>> lst[0] += func()
>>> print lst
[101, 2, 3]  # Not [190, 2, 3]

lst[0] is read (giving 1) before func() runs, so the final assignment stores 1 + 100 and overwrites the 90 written inside func().
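To see the read-modify-write explicitly, the dis module shows it; a minimal sketch (the bytecode summarized in the comments is from CPython 2.7 and slightly abridged):

import dis

def f(s, x):
    s[0] += x

dis.dis(f)
# On CPython 2.7 this prints roughly:
#   LOAD_FAST s; LOAD_CONST 0; DUP_TOPX 2; BINARY_SUBSCR   -- read s[0]
#   LOAD_FAST x; INPLACE_ADD                                -- concatenate
#   ROT_THREE; STORE_SUBSCR                                 -- write back to s[0]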

But in general, don't use s += x to concatenate strings; use str.join on a collection of strings instead.
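As a rule of thumb, collect the pieces and join once. A sketch of how the file-reading program from the question could be written ('file_name' is the placeholder name from the question):

# a file object iterates over its lines, so join can consume it directly;
# this is linear time on any Python version or implementation
with open('file_name') as in_fd:
    data = {'file_name': ''.join(in_fd)}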


Timings

LOOPS = 1000
WORD = 'ABC'*100


def list_append():
    # build a list, then join once
    buf1 = [WORD for _ in xrange(LOOPS)]
    return ''.join(buf1)


def str_concat():
    # += on a plain local string (in-place resize applies)
    buf2 = ''
    for i in xrange(LOOPS):
        buf2 += WORD


def dict_val_concat():
    # += on a string stored in a dict (extra reference, no resize)
    buf3 = {'1': ''}
    for i in xrange(LOOPS):
        buf3['1'] += WORD
    return buf3['1']


def list_val_concat():
    # += on a string stored in a list (extra reference, no resize)
    buf4 = ['']
    for i in xrange(LOOPS):
        buf4[0] += WORD
    return buf4[0]


def val_pop_concat():
    # pop the string out first: the list drops its reference
    buf5 = ['']
    for i in xrange(LOOPS):
        val = buf5.pop()
        val += WORD
        buf5.append(val)
    return buf5[0]


def val_assign_concat():
    # read by index: the list keeps its reference during +=
    buf6 = ['']
    for i in xrange(LOOPS):
        val = buf6[0]
        val += WORD
        buf6[0] = val
    return buf6[0]


>>> %timeit list_append()
10000 loops, best of 3: 71.2 us per loop
>>> %timeit str_concat()
1000 loops, best of 3: 276 us per loop
>>> %timeit dict_val_concat()
100 loops, best of 3: 9.66 ms per loop
>>> %timeit list_val_concat()
100 loops, best of 3: 9.64 ms per loop
>>> %timeit val_pop_concat()
1000 loops, best of 3: 556 us per loop
>>> %timeit val_assign_concat()
100 loops, best of 3: 9.31 ms per loop

val_pop_concat is fast here because pop() removes the list's reference to the string, so CPython can resize it in place (as @niemmi correctly guessed in the comments).
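The same sys.getrefcount trick shows why; a sketch (Python 2) contrasting an index read with pop():

import sys

buf = ['x' * 300]

val = buf[0]                 # the list still holds a reference
print sys.getrefcount(val)   # 3 -> += here would have to copy

val = buf.pop()              # the list's reference is gone
print sys.getrefcount(val)   # 2 -> += can now resize in place
val += 'y'
buf.append(val)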
