A faster way to run a for loop over a DataFrame in Python

Posted 2024-10-02 02:43:13


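I am running the following code to compute, for each DataFrame row, the number of positive days in the preceding rows and the number of days on which the stock beat the S&P 500: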

        for offset in [1,5,15,30,45,60,75,90,120,150,
                       200,250,500,750,1000,1250,1500]:
            asset['return_stock'] = (asset.Close - asset.Close.shift(1)) / (asset.Close.shift(1))
            merged_data = pd.merge(asset, sp_500, on='Date')
            total_positive_days=0
            total_beating_sp_days=0
            for index, row in merged_data.iterrows():
                print(offset, index)
                for i in range(0,offset):
                    if index-i-1>0:
                        if merged_data.loc[index-i,'Close_x'] > merged_data.loc[index-i-1,'Close_x']:
                            total_positive_days+=1
                        if merged_data.loc[index-i,'return_stock'] > merged_data.loc[index-i-1,'return_sp']:
                            total_beating_sp_days+=1

But it is very slow. Is there a way to speed it up (perhaps by somehow getting rid of the for loops)?

My dataset looks like this (merged_data is shown below):

Date     Open_x     High_x      Low_x    Close_x  Adj Close_x   Volume_x  return_stock  Pct_positive_1  Pct_beating_1  Pct_change_1  Pct_change_plus_1  Pct_positive_5  Pct_beating_5  Pct_change_5  Pct_change_plus_5  Pct_positive_15  Pct_beating_15  Pct_change_15  Pct_change_plus_15  Pct_positive_30  Pct_beating_30  Pct_change_30  Pct_change_plus_30       Open_y       High_y        Low_y      Close_y  Adj Close_y    Volume_y  return_sp
0  2010-01-04  30.490000  30.642857  30.340000  30.572857    26.601469  123432400           NaN          1311.0         1261.0           NaN          -0.001726          1310.4         1260.8           NaN           0.018562           1307.2          1257.6            NaN            0.039186      1302.066667     1252.633333            NaN            0.056579  1116.560059  1133.869995  1116.560059  1132.989990  1132.989990  3991400000   0.016043
1  2010-01-05  30.657143  30.798571  30.464285  30.625713    26.647457  150476200      0.001729          1311.0         1261.0      0.001729           0.016163          1310.4         1260.8           NaN           0.032062           1307.2          1257.6            NaN            0.031268      1302.066667     1252.633333            NaN            0.056423  1132.660034  1136.630005  1129.660034  1136.520020  1136.520020  2491020000   0.003116
2  2010-01-06  30.625713  30.747143  30.107143  30.138571    26.223597  138040000     -0.015906          1311.0         1261.0     -0.015906           0.001852          1310.4         1260.8           NaN           0.001519           1307.2          1257.6            NaN            0.058608      1302.066667     1252.633333            NaN            0.046115  1135.709961  1139.189941  1133.949951  1137.140015  1137.140015  4972660000   0.000546
3  2010-01-07  30.250000  30.285715  29.864286  30.082857    26.175119  119282800     -0.001849          1311.0         1261.0     -0.001849          -0.006604          1310.4         1260.8           NaN           0.005491           1307.2          1257.6            NaN            0.096428      1302.066667     1252.633333            NaN            0.050694  1136.270020  1142.459961  1131.319946  1141.689941  1141.689941  5270680000   0.004001
4  2010-01-08  30.042856  30.285715  29.865715  30.282858    26.349140  111902700      0.006648          1311.0         1261.0      0.006648           0.008900          1310.4         1260.8           NaN           0.029379           1307.2          1257.6            NaN            0.088584      1302.066667     1252.633333            NaN            0.075713  1140.520020  1145.390015  1136.219971  1144.979980  1144.979980  4389590000   0.002882

asset looks like this:

         Date       Open       High        Low      Close  Adj Close     Volume  return_stock  Pct_positive_1  Pct_beating_1  Pct_change_1  Pct_change_plus_1  Pct_positive_5  Pct_beating_5  Pct_change_5  Pct_change_plus_5
0  2010-01-04  30.490000  30.642857  30.340000  30.572857  26.601469  123432400           NaN          1311.0         1261.0           NaN          -0.001726          1310.4         1260.8           NaN           0.018562
1  2010-01-05  30.657143  30.798571  30.464285  30.625713  26.647457  150476200      0.001729          1311.0         1261.0      0.001729           0.016163          1310.4         1260.8           NaN           0.032062
2  2010-01-06  30.625713  30.747143  30.107143  30.138571  26.223597  138040000     -0.015906          1311.0         1261.0     -0.015906           0.001852          1310.4         1260.8           NaN           0.001519
3  2010-01-07  30.250000  30.285715  29.864286  30.082857  26.175119  119282800     -0.001849          1311.0         1261.0     -0.001849          -0.006604          1310.4         1260.8           NaN           0.005491
4  2010-01-08  30.042856  30.285715  29.865715  30.282858  26.349140  111902700      0.006648          1311.0         1261.0      0.006648           0.008900          1310.4         1260.8           NaN           0.029379

sp_500 looks like this:

         Date         Open         High          Low        Close    Adj Close      Volume  return_sp
0  1999-12-31  1464.469971  1472.420044  1458.189941  1469.250000  1469.250000   374050000        NaN
1  2000-01-03  1469.250000  1478.000000  1438.359985  1455.219971  1455.219971   931800000  -0.009549
2  2000-01-04  1455.219971  1455.219971  1397.430054  1399.420044  1399.420044  1009000000  -0.038345
3  2000-01-05  1399.420044  1413.270020  1377.680054  1402.109985  1402.109985  1085500000   0.001922
4  2000-01-06  1402.109985  1411.900024  1392.099976  1403.449951  1403.449951  1092300000   0.000956

1 answer
User
Answer #1 · Posted 2024-10-02 02:43:13

This is a partial answer.

I think the way you wrote

asset.Close - asset.Close.shift(1)

is the key to how to do this. Instead of

if merged_data.loc[index-i,'Close_x'] > merged_data.loc[index-i-1,'Close_x']

create a column that holds the change in Close_x:

merged_data['Delta_Close_x'] = merged_data.Close_x - merged_data.Close_x.shift(1)

Similarly,

if merged_data.loc[index-i,'return_stock'] > merged_data.loc[index-i-1,'return_sp']

becomes

merged_data['vs_sp'] = merged_data.return_stock - merged_data.return_sp.shift(1)
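To make the two helper columns concrete, here is a minimal sketch on a made-up frame. The numbers are hypothetical, and the `up_day` and `beat_sp` names are my own for illustration; only the other column names come from the question.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
merged_data = pd.DataFrame({
    "Close_x": [10.0, 11.0, 10.5, 10.5, 12.0],
    "return_sp": [0.01, 0.02, -0.10, 0.03, 0.01],
})
prev_close = merged_data.Close_x.shift(1)
merged_data["return_stock"] = (merged_data.Close_x - prev_close) / prev_close

# Row-to-row change in Close_x (NaN on the first row, which has no predecessor).
merged_data["Delta_Close_x"] = merged_data.Close_x - merged_data.Close_x.shift(1)
# Today's stock return minus the previous row's S&P return, as in the loop.
merged_data["vs_sp"] = merged_data.return_stock - merged_data.return_sp.shift(1)

# Boolean flags; comparisons against NaN evaluate to False.
up_day = merged_data["Delta_Close_x"] > 0
beat_sp = merged_data["vs_sp"] > 0
print(merged_data[["Delta_Close_x", "vs_sp"]].assign(up_day=up_day, beat_sp=beat_sp))
```

Each comparison now runs once over the whole column instead of once per row and per offset.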

Then you can iterate over i and use a subset like this

merged_data[(merged_data['Delta_Close_x'] > 0) & (merged_data['vs_sp'] > 0)]

There are a lot of extra details to work out, but I hope this gets you started.
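One possible way to go further and drop the row loop entirely is a rolling sum over 0/1 flags. This is only a sketch under assumptions: I am guessing that you want, for each row, the counts over the trailing `offset` rows including the current one; the `trailing_counts` helper and the output column names are hypothetical; and the handling of the first rows may differ slightly from the `index-i-1>0` guard in your loop.

```python
import pandas as pd

def trailing_counts(merged_data, offsets):
    """Per-row counts of up days and S&P-beating days in each trailing window."""
    # 1.0 where the condition holds, 0.0 otherwise (NaN comparisons give False).
    up = (merged_data["Close_x"].diff() > 0).astype(float)
    beat = (merged_data["return_stock"]
            > merged_data["return_sp"].shift(1)).astype(float)

    out = pd.DataFrame(index=merged_data.index)
    for offset in offsets:
        # min_periods=1 lets the first rows use whatever history exists.
        out[f"positive_days_{offset}"] = up.rolling(offset, min_periods=1).sum()
        out[f"beating_sp_days_{offset}"] = beat.rolling(offset, min_periods=1).sum()
    return out

# Hypothetical toy data; only the column names come from the question.
merged_data = pd.DataFrame({
    "Close_x": [10.0, 11.0, 10.5, 10.5, 12.0],
    "return_stock": [float("nan"), 0.10, -0.05, 0.0, 0.14],
    "return_sp": [0.01, 0.02, -0.10, 0.03, 0.01],
})
counts = trailing_counts(merged_data, offsets=[1, 2, 5])
print(counts)
```

The only Python-level loop left is the one over the offsets, so the cost no longer grows with rows times offset. Dividing a count column by its offset would give a fraction, if that is what the Pct_ columns are meant to hold.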
