为什么这个二进制堆的实现比Python的stdlib慢？

def percDown(lst, start, end, node): #Moves given node down the heap, starting at index start, until the heap property is #satisfied (all children must be larger than their parent) iChild = 2 * start + 1 i = start # if the node has reached the end of the heap (i.e. no children left), # return its index (we are done) if iChild > end - 1: return start #if the second child exists and is smaller than the first child, use that child index #for comparing later if iChild + 1 < end and lst[iChild + 1] < lst[iChild]: iChild += 1 #if the smallest child is less than the node, it is the new parent if lst[iChild] < node: #move the child to the parent position lst[start] = lst[iChild] #continue recursively going through the child nodes of the # new parent node to find where node is meant to go i = percDown(lst, iChild, end, node) return i

def heapify(lst): iLastParent = math.floor((len(lst) - 1) / 2) length = len(lst) while iLastParent >= 0: ele = lst[iLastParent] i = percDown(lst, iLastParent, length, lst[iLastParent]) lst[i] = ele iLastParent -= 1

2条回答

网友

1楼 · 编辑于 2024-10-01 04:56:54

0.6s与2.6s的差别比4x小。是不是太大了？在

没有足够的信息来回答。一个4倍的差异肯定是由算法错误引起的……但是如果不进行不同大小的测试，真的没有办法分辨。在

比如说，如果在1000时只有1.2倍的差异，100000时只有4倍的差异，1000000时只有12倍的差异，那么你的算法复杂度很可能会更差，这意味着你很可能出了问题，这是你需要解决的问题。在

另一方面，如果这三种尺寸都相差4倍，那么在你的开销中就有一个更大的常数乘数。很可能是因为您已经在Python中运行了一个内部循环，而（CPython）stdlib版本使用的是在C中执行相同循环的_heapq加速器模块，如Martijn Pieters' answer中所述。所以，你没做错什么。你也许可以进行一些微优化，但最终你必须要么用C重写代码的核心，要么在JIT优化的解释器中运行它，以接近stdlib的性能。实际上，如果你只是为了理解算法而写这篇文章，你不需要这么做。在

另外，您可能需要尝试在PyPy中运行比较。它的stdlib大部分是用纯Python编写的，没有特殊的优化，但是经过优化的JIT编译器使它几乎与CPython中的本机C代码一样快。同样的JIT也会应用到你的代码中，这意味着你未优化的代码通常也和CPython中的本机C代码一样。当然，这并不能保证这一点，也不能改变这样一个事实：如果你想测试算法复杂度，你总是需要以不同的大小进行测试。在

网友

2楼 · 编辑于 2024-10-01 04:56:54

Pythonheapq模块使用C扩展。你不能打败C代码。在

从^{} module source code：

# If available, use C implementation
try:
    from _heapq import *
except ImportError:
    pass

另请参见^{} C source。在

相关问题更多 >

编程相关推荐

热门问题

热门文章