<p>I figured it out. This is my <em>real</em> code (so I won't post all of it). It works, but I'm not sure it is implemented in the <strong>fastest way</strong>.</p>
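<p>I'm using <strong><em>DataFrame.apply</em></strong>. It isn't vectorized, but it should still be faster than an explicit Python loop over the rows. Can someone show how to rewrite the code below in a vectorized way?</p>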
<p>I'm following this article: <a href="https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6" rel="nofollow noreferrer">https://engineering.upside.com/a-beginners-guide-to-optimizing-pandas-code-for-speed-c09ef2c6a4d6</a></p>
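<p>To make the distinction concrete, here is a toy sketch contrasting a row-wise <code>apply</code> with the equivalent whole-column operation. The frame and its columns <code>a</code> and <code>b</code> are made up for illustration and are not taken from my code:</p>

```python
import pandas as pd

# Hypothetical toy frame, only for illustrating the two styles.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# Row-wise: the lambda is called once per row from Python.
row_wise = df.apply(lambda rw: rw["a"] + rw["b"], axis=1)

# Vectorized: a single operation over the whole columns.
vectorized = df["a"] + df["b"]
```

<p>Both produce the same values; the second form is what I mean by "vectorized".</p>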
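<p>So far I have not been able to rewrite my code in a <strong>vectorized</strong> way. Given the nature of the lookups, I'm starting to think the code below cannot be <strong>vectorized</strong> (I'd be happy to be proven wrong):</p>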
<pre><code>pdPnl = pd.DataFrame.from_records([ObjectUtil.objectPropertiesToDictionary(pnl) for pnl in profitLosses], columns=ObjectUtil.objectPropertiesToDictionary(profitLosses[0]).keys())
pdPnl["TM1"] = pdPnl.apply(lambda rw : rw["COB"] - timedelta(days=1) , axis=1)
pdPnl["MonthStart"] = pdPnl.apply(lambda rw : rw["COB"].replace(day=1), axis=1)
pdPnl["QuarterStart"] = pdPnl.apply(lambda rw : DateTimeUtil.getQuarterStart(rw["COB"], rw["COB"].year), axis=1)
pdPnl["YearStart"] = pdPnl.apply(lambda rw : datetime(rw["COB"].year, 1, 1), axis=1)
pdPnl["DTDRealizedPnl"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeField(pdPnl, rw["TM1"], rw["InceptionRealizedPnl"], "InceptionRealizedPnl"), axis=1)
pdPnl["DTDUnrealizedPnl"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeField(pdPnl, rw["TM1"], rw["InceptionUnrealizedPnl"], "InceptionUnrealizedPnl"), axis=1)
pdPnl["MTDRealizedPnl"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeField(pdPnl, rw["MonthStart"], rw["InceptionRealizedPnl"], "InceptionRealizedPnl"), axis=1)
pdPnl["MTDUnrealizedPnl"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeField(pdPnl, rw["MonthStart"], rw["InceptionUnrealizedPnl"], "InceptionUnrealizedPnl"), axis=1)
pdPnl["YTDRealizedPnl"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeField(pdPnl, rw["YearStart"], rw["InceptionRealizedPnl"], "InceptionRealizedPnl"), axis=1)
pdPnl["YTDUnrealizedPnl"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeField(pdPnl, rw["YearStart"], rw["InceptionUnrealizedPnl"], "InceptionUnrealizedPnl"), axis=1)
pdPnl["SharpeRatio"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeSharpeRatio(pdPnl, rw["COB"]), axis=1)
pdPnl["MaxDrawDown"] = pdPnl.apply(lambda rw : PnlCalculatorBase.computeMaxDrawDown(pdPnl, rw["COB"]), axis=1)
pnlDict = pdPnl.to_dict() # Then convert back to List of ProfitLoss (Slow...)
</code></pre>
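<p>The date columns, and possibly the <code>computeField</code> lookups, look like candidates for vectorization. Below is a minimal sketch on made-up data. It assumes <code>COB</code> is a <code>datetime64</code> column with no time-of-day part and exactly one row per date, and that <code>DateTimeUtil.getQuarterStart</code> returns the calendar quarter start (an assumption, since that helper is not shown here). <code>diff_vs_lookup</code> is a hypothetical name: it replaces the per-row lookup with a single <code>Series.map</code> against a <code>COB</code>-indexed series and leaves <code>NaN</code> where no matching date exists. I have not verified it against my real data.</p>

```python
import numpy as np
import pandas as pd

# Made-up sample: one row per COB date, COB values unique (assumption).
pnl = pd.DataFrame({
    "COB": pd.to_datetime(["2019-12-31", "2020-01-01", "2020-01-02", "2020-01-03"]),
    "InceptionRealizedPnl": [100.0, 110.0, 105.0, 120.0],
})

# Date columns computed on the whole column at once.
pnl["TM1"] = pnl["COB"] - pd.Timedelta(days=1)
pnl["MonthStart"] = pnl["COB"].dt.to_period("M").dt.to_timestamp()
pnl["QuarterStart"] = pnl["COB"].dt.to_period("Q").dt.to_timestamp()  # calendar quarters assumed
pnl["YearStart"] = pnl["COB"].dt.to_period("Y").dt.to_timestamp()


def diff_vs_lookup(frame, date_col, field):
    """Hypothetical helper: today's `field` minus `field` on the row whose COB equals `date_col`."""
    by_cob = frame.set_index("COB")[field]  # lookup table keyed by COB
    # map() looks each date up in by_cob; dates with no matching row give NaN.
    return frame[field] - frame[date_col].map(by_cob)


pnl["DTDRealizedPnl"] = diff_vs_lookup(pnl, "TM1", "InceptionRealizedPnl")
pnl["MTDRealizedPnl"] = diff_vs_lookup(pnl, "MonthStart", "InceptionRealizedPnl")
```

<p>The same helper would cover the unrealized and YTD columns by changing the two column names. As far as I know, <code>map</code> fails when the lookup index is not unique, which is why the one-row-per-date assumption matters.</p>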
<p>The lookup functions are:</p>
<pre><code>@staticmethod
def lookUpRow(pdPnl, cob):
    # Boolean-mask lookup: scans the whole frame on every call.
    return pdPnl[pdPnl["COB"]==cob]

@staticmethod
def computeField(pdPnl, cob, todaysPnl, targetField):
    val = np.nan
    otherRow = PnlCalculatorBase.lookUpRow(pdPnl, cob)
    if otherRow is not None and otherRow[targetField].shape[0]>0:
        try:
            tm1InceptionRealizedPnl = otherRow[targetField].iloc[0]
            val = todaysPnl - tm1InceptionRealizedPnl
        except Exception:
            # slow...
            errMsg = "Failed lookup for " + str(cob) + " " + targetField
            logging.error(errMsg)
            val = np.nan
    return val

@staticmethod
def computeSharpeRatio(pdPnl, cob):
    val = None
    pdPnl = pdPnl[(pdPnl['COB']<=cob)]
    # Copy so that adding a column does not write into a slice of the caller's frame.
    pdPnl = pdPnl.loc[:,["COB", "DTDRealizedPnl","DTDUnrealizedPnl"]].copy()
    pdPnl["TotalDTD"] = pdPnl.apply(lambda rw : rw["DTDRealizedPnl"] + rw["DTDUnrealizedPnl"], axis=1)
    # @todo, We don't have risk free rate for Sharpe Ratio calc. Here's just total DTD avg return over standard deviation
    # https://en.wikipedia.org/wiki/Sharpe_ratio
    mean = pdPnl["TotalDTD"].mean()
    std = pdPnl["TotalDTD"].std()
    val = mean / std
    return val

@staticmethod
def computeMaxDrawDown(pdPnl, cob):
    val = None
    # Most negative realized DTD value on or before cob (NaN if there are no losing days).
    pdPnl = pdPnl[(pdPnl['COB']<=cob) & (pdPnl["DTDRealizedPnl"]<0)]
    val = pdPnl["DTDRealizedPnl"].min()
    return val
</code></pre>
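<p>For the two running statistics, expanding windows may avoid re-filtering the frame for every row. Here is a minimal sketch on made-up numbers. It assumes the rows are sorted by <code>COB</code> with one row per date, so that "all rows on or before this COB" is simply "all rows up to this one". It keeps the simplified definitions used above (mean over sample standard deviation with no risk-free rate, and the most negative realized DTD value so far) and is untested on my real data:</p>

```python
import numpy as np
import pandas as pd

# Made-up data, sorted by COB, one row per date (assumption).
pnl = pd.DataFrame({
    "COB": pd.date_range("2020-01-01", periods=5, freq="D"),
    "DTDRealizedPnl": [np.nan, 10.0, -5.0, 15.0, -7.0],
    "DTDUnrealizedPnl": [np.nan, 2.0, 1.0, -3.0, 4.0],
})

# Whole-column sum instead of a row-wise apply.
total_dtd = pnl["DTDRealizedPnl"] + pnl["DTDUnrealizedPnl"]

# Running mean and running sample std over all rows up to and including each row.
running = total_dtd.expanding()
pnl["SharpeRatio"] = running.mean() / running.std()

# Keep only losing days, then take the running minimum; NaN until the first loss.
losses = pnl["DTDRealizedPnl"].where(pnl["DTDRealizedPnl"] < 0)
pnl["MaxDrawDown"] = losses.expanding().min()
```

<p>If <code>COB</code> can repeat or the frame is unsorted, this would no longer match the row-by-row filter, so a <code>sort_values("COB")</code> and a uniqueness check would be needed first.</p>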