Why is Pandas DataFrame.groupby faster when it is first assigned to a variable?

%timeit somedf.groupby('someBoolColumn')['someBoolColumn'].count()
484 µs ± 9.52 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit grp = somedf.groupby('someBoolColumn')
grp['someBoolColumn'].count()
146 µs ± 1.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

1 answer

User

#1 · Posted 2024-09-27 23:15:52

The IPython docs for the timeit magic state:

In cell mode, the statement in the first line is used as setup code (executed but not timed) and the body of the cell is timed. The cell body has access to any variables created in the setup code.

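The key phrase here is "executed but not timed". Cell mode is triggered by using the double-percent form, %%timeit. IPython also prints a short note about this in the documentation it shows when you type %magic at the IPython prompt: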

%%timeit x = numpy.random.randn((100, 100))
numpy.linalg.svd(x)
will time the execution of the numpy svd routine, running the assignment of x as part of the setup phase, which is not timed.

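Therefore,

%%timeit grp = somedf.groupby('someBoolColumn')
grp['someBoolColumn'].count()

times grp['someBoolColumn'].count(), but not the assignment grp = somedf.groupby('someBoolColumn'). The groupby call runs as untimed setup, which is why the second measurement looks faster.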
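The same split between setup and timed code exists in Python's standard timeit module, which may make the effect easier to see outside IPython. The following is only an illustrative sketch: slow_setup and cheap_body are hypothetical stand-ins (with an artificial sleep) for the groupby call and the count call, not anything from the question.

```python
import time
import timeit


def slow_setup():
    """Hypothetical stand-in for an expensive setup step (e.g. a groupby call)."""
    time.sleep(0.05)  # artificial cost, chosen only for illustration
    return list(range(1000))


def cheap_body(data):
    """Hypothetical stand-in for the work done in the cell body."""
    return len(data)


# Names made visible to the timed statements.
namespace = {"slow_setup": slow_setup, "cheap_body": cheap_body}
runs = 5

# Analogue of `%%timeit data = slow_setup()`:
# the setup statement runs once per timing call and is not timed.
setup_excluded = timeit.timeit(
    "cheap_body(data)",
    setup="data = slow_setup()",
    globals=namespace,
    number=runs,
)

# Analogue of leaving the first line empty:
# both statements are part of the timed body.
setup_included = timeit.timeit(
    "data = slow_setup(); cheap_body(data)",
    globals=namespace,
    number=runs,
)

print(f"setup excluded: {setup_excluded / runs:.6f} s per run")
print(f"setup included: {setup_included / runs:.6f} s per run")
```

The second figure includes the sleep on every run, while the first does not, which mirrors the gap between the two measurements in the question.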

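How to use %%timeit without a setup line:

To time both statements with %%timeit, simply leave the rest of the %%timeit line empty and put both statements in the cell body: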

%%timeit
grp = somedf.groupby('someBoolColumn')
grp['someBoolColumn'].count()

Press Enter twice to finish the cell.
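For a like-for-like comparison outside IPython, one option is to time both variants with the groupby call inside the timed statement. This is a sketch that assumes pandas is installed; somedf here is a small made-up frame, not the asker's data, so the absolute numbers will differ from those in the question.

```python
import timeit

import pandas as pd

# Made-up example data; the question does not show its DataFrame.
somedf = pd.DataFrame({"someBoolColumn": [True, False, True, True] * 250})

namespace = {"somedf": somedf}
runs = 200

# One-liner: groupby and count are both timed.
chained = timeit.timeit(
    "somedf.groupby('someBoolColumn')['someBoolColumn'].count()",
    globals=namespace,
    number=runs,
)

# Two statements, both timed (the analogue of an empty %%timeit line).
assigned = timeit.timeit(
    "grp = somedf.groupby('someBoolColumn'); grp['someBoolColumn'].count()",
    globals=namespace,
    number=runs,
)

print(f"chained:  {chained / runs * 1e6:.1f} µs per run")
print(f"assigned: {assigned / runs * 1e6:.1f} µs per run")

# Both variants compute the same result.
counts = somedf.groupby("someBoolColumn")["someBoolColumn"].count()
print(counts)
```

With the groupby call timed in both cases, the two figures should be close, since the only extra work in the second variant is a variable assignment.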
