My code is as follows:
import numpy as np
import pandas as pd

colum1 = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
colum2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
colum3 = [0.85, 0.80, 0.80, 0.80, 0.85, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
colum4 = [1743.85, 1485.58, 1250.07, 1021.83, 818.96, 628.05,
          455.40, 319.03, 190.86, 97.07, 26.96, 0.00]

df = pd.DataFrame({
    'colum1': colum1,
    'colum2': colum2,
    'colum3': colum3,
    'colum4': colum4,
})

df['result'] = 0

# Row k depends on the result of row k-1, so each pass over the whole
# column only settles one more row; hence len(colum2) passes.
for i in range(len(colum2)):
    df['result'] = np.where(
        df['colum2'] <= 5,
        np.where(
            df['colum2'] == 1,
            df['colum4'],  # first row: take colum4 unchanged
            np.where(
                (df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))) > 0,
                (df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))),
                0
            )
        ),
        np.where(
            (df['colum4'] - (df['result'].shift(1) * df['colum1'])) > 0,
            (df['colum4'] - (df['result'].shift(1) * df['colum1'])),
            0
        )
    )
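I need to perform the same operation without resorting to a for loop. That would help a lot, because I work with thousands of records and this approach is very slow.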
My expected result is as follows:
    colum1  colum2  colum3   colum4       result
0     0.05       1    0.85  1743.85  1743.850000
1     0.05       2    0.80  1485.58  1415.826000
2     0.05       3    0.80  1250.07  1193.436960
3     0.05       4    0.80  1021.83   974.092522
4     0.05       5    0.85   818.96   777.561068
5     0.05       6    0.00   628.05   589.171947
6     0.05       7    0.00   455.40   425.941403
7     0.05       8    0.00   319.03   297.732930
8     0.05       9    0.00   190.86   175.973354
9     0.05      10    0.00    97.07    88.271332
10    0.05      11    0.00    26.96    22.546433
11    0.05      12    0.00     0.00     0.000000
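The first step is to remove the loop over the index and to replace the test for numbers greater than 0 with np.maximum. This works because np.where(a > 0, a, 0) is, for our purposes, equivalent to np.maximum(0, a). At the same time, define the longer expressions separately so that the code stays readable.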
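As a quick check of that equivalence, the sketch below compares the two forms on a small array and then gives the two long expressions from the question their own names. The names prev, early and late are illustrative and do not come from the original post. One caveat: np.maximum propagates NaN, whereas the np.where form turns NaN into 0. Here that only affects the first row, where shift(1) yields NaN, and that row is replaced by colum4 anyway.

```python
import numpy as np
import pandas as pd

# The two ways of clipping negative values to zero agree on ordinary numbers.
a = np.array([-1.5, 0.0, 2.0, 7.25])
clipped_where = np.where(a > 0, a, 0)
clipped_max = np.maximum(0, a)
same = np.array_equal(clipped_where, clipped_max)

# First three rows of the question's data, enough to show the named expressions.
df = pd.DataFrame({
    'colum1': [0.05, 0.05, 0.05],
    'colum2': [1, 2, 3],
    'colum3': [0.85, 0.80, 0.80],
    'colum4': [1743.85, 1485.58, 1250.07],
})
df['result'] = 0.0

# Hypothetical names for the sub-expressions that the nested np.where repeats.
prev = df['result'].shift(1)  # previous row's result; NaN in the first row
early = np.maximum(0, df['colum4'] - prev * df['colum1'] * df['colum3'])  # colum2 <= 5
late = np.maximum(0, df['colum4'] - prev * df['colum1'])                  # colum2 > 5
```

With the sub-expressions named once, the selection logic no longer has to repeat each formula in both the condition and the value.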
The next step is to remove the nested np.where statements, which makes this version much easier to manage.
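One way to flatten the nested np.where calls is np.select, which takes a list of conditions and a matching list of choices and uses the first condition that holds. Treat np.select as an assumption here, since the text does not preserve which function the answer used. The helper one_pass and its variable names are likewise hypothetical. Because each row depends on the previous row's result, this sketch still applies the pass once per row, as the question's code does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'colum1': [0.05] * 12,
    'colum2': list(range(1, 13)),
    'colum3': [0.85, 0.80, 0.80, 0.80, 0.85] + [0.0] * 7,
    'colum4': [1743.85, 1485.58, 1250.07, 1021.83, 818.96, 628.05,
               455.40, 319.03, 190.86, 97.07, 26.96, 0.00],
})


def one_pass(df, result):
    """Recompute the whole column once from the previous estimate (hypothetical helper)."""
    prev = result.shift(1)
    early = np.maximum(0, df['colum4'] - prev * df['colum1'] * df['colum3'])
    late = np.maximum(0, df['colum4'] - prev * df['colum1'])
    # Conditions are checked in order, so colum2 == 1 wins over colum2 <= 5.
    conditions = [(df['colum2'] == 1).to_numpy(), (df['colum2'] <= 5).to_numpy()]
    choices = [df['colum4'].to_numpy(), early.to_numpy()]
    values = np.select(conditions, choices, default=late.to_numpy())
    return pd.Series(values, index=df.index)


# Each pass settles one more row, so len(df) passes settle all of them.
result = pd.Series(0.0, index=df.index)
for _ in range(len(df)):
    result = one_pass(df, result)
df['result'] = result
```

On the question's data this reproduces the expected result column. The repeated passes still cost time proportional to the square of the row count; a single row-by-row loop over plain NumPy arrays would compute the same recurrence in linear time and may be worth trying for thousands of records.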