I built two functions, major_check_with_dataframe and major_check_with_list, and I want to see which one runs faster. I am confused by their run times.
import numpy as np
DTYPE_FLOAT = np.float64  # np.float was an alias for the builtin float and was removed in NumPy 1.24
import anapyfunc.major
import pandas as pd
from testpython.timer import timer
# wrapper function
def major_check(**kwargs):
    # return major_check_with_list(**kwargs)
    return major_check_with_dataframe(**kwargs)

# functions to be timed
def major_check_with_dataframe(df, s_major_single_hi_percent, s_major_single_lo_percent):
    ...

def major_check_with_list(source_list, s_major_single_hi_percent, s_major_single_lo_percent):
    ...
# main function starts here
t = timer.Timer(verbose = True, run = False)
t.set_name(name = 'major check timer')
a = np.random.choice(101, 2500)
b = np.random.choice(101, 2500)
c = np.random.choice(101, 2500)
s_major_single_hi_percent = 70
s_major_single_lo_percent = 10
dd = {'a' : a, 'b' : b , 'c' : c}
df = pd.DataFrame(dd)
# axis 0 = tick
# axis 1 = input arrays
t.set_name(name = 'major_check_with_dataframe')
t.start()
ret1 = anapyfunc.major.major_check_with_dataframe(
    df = df,
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
t.set_name(name = 'major_check_with_list')
t.start()
ret2 = anapyfunc.major.major_check_with_list(
    source_list = [a, b, c,],
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
t.set_name(name = 'major_check')
t.start()
ret3 = anapyfunc.major.major_check(
    # source_list = [a, b, c,],
    df = df,
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
Output when major_check calls major_check_with_dataframe:
major_check_with_dataframe elapsed time: 94.261000 ms
major_check_with_list elapsed time: 2.316000 ms
major_check elapsed time: 3.055000 ms
Output when major_check calls major_check_with_list:
major_check_with_dataframe elapsed time: 95.042000 ms
major_check_with_list elapsed time: 2.240000 ms
major_check elapsed time: 2.240000 ms
I found that if I run major_check_with_dataframe a second time, its run time drops to almost the same as that of the major_check wrapper function:
t.set_name(name = 'major_check_with_dataframe')
t.start()
ret1 = anapyfunc.major.major_check_with_dataframe(
    df = df,
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
t.set_name(name = 'major_check_with_list')
t.start()
ret2 = anapyfunc.major.major_check_with_list(
    source_list = [a, b, c,],
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
t.set_name(name = 'major_check')
t.start()
ret3 = anapyfunc.major.major_check(
    # source_list = [a, b, c,],
    df = df,
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
# second timed run of major_check_with_dataframe
t.set_name(name = 'major_check_with_dataframe')
t.start()
ret1 = anapyfunc.major.major_check_with_dataframe(
    df = df,
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
t.set_name(name = 'major_check')
t.start()
ret3 = anapyfunc.major.major_check(
    # source_list = [a, b, c,],
    df = df,
    s_major_single_hi_percent = s_major_single_hi_percent,
    s_major_single_lo_percent = s_major_single_lo_percent,
)
t.stop_reset()
Output:
major_check_with_dataframe elapsed time: 95.608000 ms
major_check_with_list elapsed time: 2.350000 ms
major_check elapsed time: 3.048000 ms
major_check_with_dataframe elapsed time: 2.569000 ms
major_check elapsed time: 2.520000 ms
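Could this be some kind of memory caching effect? The behavior is the same even if I put the functions in a class, run the function once through a class object, and delete/garbage-collect that object after each run.
All functions correctly return the expected values. What am I missing? The versions I am using are:
Python 3.4.3 (default, Nov 17 2016, 01:08:31)
pandas 0.21.0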
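One way to check whether only the first call pays a one-off cost is to time the same call several times and compare the first measurement with the later ones. The following is a minimal sketch using only the standard library. The helper `time_calls` and the stand-in function `slow_first_call` are hypothetical names introduced for illustration: they are not part of `anapyfunc` or `testpython.timer`, and the simulated setup delay is an assumption, not a claim about what pandas does internally.

```python
import time


def time_calls(func, repeats=5, **kwargs):
    """Call func(**kwargs) `repeats` times and return each call's elapsed time in ms."""
    elapsed_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(**kwargs)
        elapsed_ms.append((time.perf_counter() - start) * 1000.0)
    return elapsed_ms


# Hypothetical stand-in for a function whose first call does extra one-time work.
_cache = {}


def slow_first_call(n):
    if "table" not in _cache:
        time.sleep(0.05)  # simulated one-off setup cost (assumption for the demo)
        _cache["table"] = list(range(n))
    return sum(_cache["table"])


timings = time_calls(slow_first_call, repeats=5, n=1000)
first, steady = timings[0], min(timings[1:])
print("first call: %.3f ms, best later call: %.3f ms" % (first, steady))
```

Applied to the code in the question, the same helper could be called as `time_calls(anapyfunc.major.major_check_with_dataframe, df=df, s_major_single_hi_percent=70, s_major_single_lo_percent=10)`. If only the first entry is large, the cost belongs to the first call rather than to the function's steady-state speed, and the fair comparison against major_check_with_list is between the later measurements.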