My Python program runs faster than the Java version of the same program. What gives?

    import java.util.HashSet;
    import java.util.Set;

    class SpeedTest {
        public static void main(String[] args) {
            int iterations = 10000000;
            // Pre-size the table so it never has to grow during the loop.
            Set<Integer> counts = new HashSet<>(2 * iterations, 0.75f);

            long startTime = System.currentTimeMillis();
            for (int i = 0; i < iterations; i++) {
                counts.add(i); // each int is autoboxed to an Integer here
            }
            long totalTime = System.currentTimeMillis() - startTime;

            System.out.println("TOTAL TIME = " + (totalTime / 1000f));
            System.out.println(counts.size());
        }
    }

3 Answers

User

Answer 1 · edited 2024-10-01 19:16:15

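Another possible explanation is that sets in Python are implemented natively in C, whereas Java's HashSet is implemented in Java itself. That alone could give Python's set a built-in speed advantage.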

User

Answer 2 · edited 2024-10-01 19:16:15

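You're not really testing Java against Python. You're testing java.util.HashSet against Python's native set and integer handling.

Apparently, in this particular microbenchmark, the Python side really is faster.

I tried replacing HashSet with TIntHashSet from GNU trove and got a speedup factor between 3 and 4, which puts Java slightly ahead of Python.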

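The real question is whether your example code is as representative of your application code as you think. Have you run a profiler and confirmed that most of the CPU time is spent putting a huge number of ints into a HashSet? If not, the example is irrelevant. Even if the only difference is that your production code stores objects other than ints, creating them and computing their hash codes could easily dominate the set insertion (and completely wipe out Python's advantage in handling ints), which would make the whole question moot.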
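To illustrate that last point, here is a rough Python sketch comparing set insertion of plain ints with insertion of small objects that have to be created and hashed first. The `Point` class, the `timed_fill` helper, and the value of `n` are hypothetical and chosen only for illustration; nothing here comes from the question's code, and the actual timings depend on the machine and interpreter.

```python
import time


class Point:
    """Hypothetical application object (an assumption, not from the question)."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __hash__(self):
        # Hashing now costs a tuple build plus a tuple hash per insert.
        return hash((self.x, self.y))

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


def timed_fill(make_item, n):
    """Insert n items produced by make_item into a set; return (seconds, size)."""
    items = set()
    start = time.perf_counter()
    for i in range(n):
        items.add(make_item(i))
    return time.perf_counter() - start, len(items)


n = 200_000
int_time, int_count = timed_fill(lambda i: i, n)
point_time, point_count = timed_fill(lambda i: Point(i, -i), n)

print(f"ints:   {int_time:.3f}s for {int_count} items")
print(f"points: {point_time:.3f}s for {point_count} items")
```

If the second number is much larger than the first on your machine, the cost is in object creation and hashing rather than in the set itself, which is exactly the situation where the int-only benchmark stops being informative.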

User

Answer 3 · edited 2024-10-01 19:16:15

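I suspect that Python uses the integer value itself as its hash, and that the hash-table-based set implementation uses that value directly. From the comments in the source: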

This isn't necessarily bad! To the contrary, in a table of size 2**i, taking the low-order i bits as the initial table index is extremely fast, and there are no collisions at all for dicts indexed by a contiguous range of ints. The same is approximately true when keys are "consecutive" strings. So this gives better-than-random behavior in common cases, and that's very desirable.
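A minimal sketch of what that comment describes, assuming CPython, where small non-negative ints hash to their own value. The initial table index is modelled here as just the low-order bits of the hash; the real implementation also handles probing and resizing, so `table_size` and `initial_slots` are illustrative names only.

```python
# Assumption: CPython, where small non-negative ints hash to themselves.
small_ints = range(1000)
hashes_equal_values = all(hash(i) == i for i in small_ints)

# Simplified model: in a table of size 2**k, take the low-order k bits
# of the hash as the initial slot.
table_size = 2 ** 10
initial_slots = {hash(i) & (table_size - 1) for i in small_ints}

# A contiguous range of ints lands in distinct initial slots.
no_initial_collisions = len(initial_slots) == len(small_ints)

print(hashes_equal_values, no_initial_collisions)
```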

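This microbenchmark is something of a best case for Python, because it results in exactly zero hash collisions. If Java's HashSet rehashes the keys, on the other hand, it has to do that extra work and also behaves much worse when collisions occur.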

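If you randomly shuffle the range before the loop, the runtime is more than 2x slower, even when the shuffle and the list creation are done outside the loop.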
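A rough way to try this yourself is sketched below. The helper `time_set_inserts` and the reduced `iterations` value are my own choices for illustration, not code from the question, and the measured ratio will vary with the machine and Python version.

```python
import random
import time


def time_set_inserts(values):
    """Insert values one by one into a fresh set; return (seconds, size)."""
    counts = set()
    start = time.perf_counter()
    for v in values:
        counts.add(v)
    elapsed = time.perf_counter() - start
    return elapsed, len(counts)


# The question uses 10_000_000; a smaller value keeps this run short.
iterations = 1_000_000

ordered = list(range(iterations))
shuffled = ordered[:]
random.shuffle(shuffled)  # done before timing starts, outside the loop

ordered_time, ordered_size = time_set_inserts(ordered)
shuffled_time, shuffled_size = time_set_inserts(shuffled)

print(f"ordered:  {ordered_time:.3f}s")
print(f"shuffled: {shuffled_time:.3f}s")
```

Both runs insert the same values, so any difference comes from the insertion order alone.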
