<p>Here is the complete snippet for a single row:</p>
<pre><code>>>> import numpy as np
>>> x1 = np.array( [1,0,0] )
>>> x2 = np.array( [0,1,0] )
>>> x3 = np.array( [0,0,1] )
>>> total_timeslot = 200
>>> HomeG_Time = [93, 109, 187]
>>> AwayG_Time = [90, 177]
>>> ExpG_Home=2.2
>>> ExpG_Away=1.8
>>> y = np.array( [1 - (ExpG_Home + ExpG_Away), ExpG_Home, ExpG_Away] )
>>> def squared_diff(x1, x2, x3, y):
...     ssd = []
...     for k in range(total_timeslot):
...         if k in HomeG_Time:
...             ssd.append(sum((x2 - y) ** 2))
...         elif k in AwayG_Time:
...             ssd.append(sum((x3 - y) ** 2))
...         else:
...             ssd.append(sum((x1 - y) ** 2))
...     return ssd
...
>>> sum(squared_diff(x1, x2, x3, y))
4765.599999999989
</code></pre>
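<p>The loop only ever appends one of three distinct values, so the total can also be written as "count times error" for each of the three cases. The sketch below checks that against the loop for the same single row. It assumes the home and away goal times do not overlap and all fall inside <code>range(total_timeslot)</code>; <code>sq_err</code>, <code>closed_form</code> and <code>looped</code> are names made up for this illustration.</p>

```python
import numpy as np

total_timeslot = 200
HomeG_Time = [93, 109, 187]
AwayG_Time = [90, 177]
y = np.array([1 - (2.2 + 1.8), 2.2, 1.8])

# Rows of the identity matrix are the one-hot vectors x1, x2, x3.
x1, x2, x3 = np.eye(3)

def sq_err(x, y):
    """Squared error between one one-hot vector and y (hypothetical helper)."""
    return float(np.sum((x - y) ** 2))

# Number of slots that fall into each of the three cases.
n2 = len(HomeG_Time)
n3 = len(AwayG_Time)
n1 = total_timeslot - (n2 + n3)

closed_form = n1 * sq_err(x1, y) + n2 * sq_err(x2, y) + n3 * sq_err(x3, y)

# Same quantity computed slot by slot, as in the loop above.
looped = sum(
    sq_err(x2 if k in HomeG_Time else x3 if k in AwayG_Time else x1, y)
    for k in range(total_timeslot)
)

print(closed_form, looped)
```

<p>Both values agree up to floating-point rounding, which is what makes the per-row vectorised version possible.</p>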
<p>Assuming that is what you want, use <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html" rel="nofollow noreferrer">pandas.DataFrame.apply</a> to compute <code>y</code> for every row as an array of shape (N, 3):</p>
<pre><code>>>> y = np.array( df.apply(lambda row: [1 - (row.ExpG_Home + row.ExpG_Away),
... row.ExpG_Home, row.ExpG_Away ],
... axis=1).tolist() )
>>> y.shape
(5, 3)
</code></pre>
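<p>If the row-wise <code>apply</code> turns out to be slow on a large frame, the same (N, 3) array can probably be built directly from the two columns. This is only a sketch: the two-row <code>df</code> below is made up for illustration, and only the column names <code>ExpG_Home</code> and <code>ExpG_Away</code> come from your question.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names are taken from the question.
df = pd.DataFrame({"ExpG_Home": [2.2, 1.5], "ExpG_Away": [1.8, 0.9]})

home = df["ExpG_Home"].to_numpy()
away = df["ExpG_Away"].to_numpy()

# One row per match: [1 - (home + away), home, away]
y = np.column_stack([1 - (home + away), home, away])

print(y.shape)
```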
<p>Now compute the squared error for a given <code>x</code>:</p>
<pre><code>>>> def squared_diff(x, y):
...     return np.sum(np.square(x - y), axis=1)
</code></pre>
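<p>This works because of NumPy broadcasting: <code>x</code> has shape (3,) and <code>y</code> has shape (N, 3), so <code>x - y</code> subtracts <code>x</code> from every row, and summing over <code>axis=1</code> leaves one error per row. A small check with illustrative values for <code>y</code> (not taken from your data):</p>

```python
import numpy as np

def squared_diff(x, y):
    # x: shape (3,), y: shape (N, 3) -> result: shape (N,)
    return np.sum(np.square(x - y), axis=1)

x2 = np.array([0, 1, 0])
y = np.array([[-3.0, 2.2, 1.8],
              [-1.4, 1.5, 0.9]])  # illustrative rows

err = squared_diff(x2, y)
print(err.shape)
```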
<p>In your example, if <code>error2</code> is <code>squared_diff(x2, y)</code>, it has to be added once for every entry in <code>HomeG_Time</code>, so you need the number of goals in each list:</p>
<pre><code>>>> n3 = df.AwayG_Time.apply(len)
>>> n2 = df.HomeG_Time.apply(len)
>>> n1 = 200 - (n2 + n3)
</code></pre>
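<p>As an alternative to <code>apply(len)</code>, <code>Series.str.len()</code> also returns the length of list-valued elements. A quick sketch on a made-up two-row frame (the values are hypothetical; <code>200</code> is the <code>total_timeslot</code> from above):</p>

```python
import pandas as pd

# Hypothetical data; the second match has one home goal and no away goals.
df = pd.DataFrame({
    "HomeG_Time": [[93, 109, 187], [10]],
    "AwayG_Time": [[90, 177], []],
})

n2 = df["HomeG_Time"].str.len()   # home goals per row
n3 = df["AwayG_Time"].str.len()   # away goals per row
n1 = 200 - (n2 + n3)              # slots without a goal

print(n1.tolist(), n2.tolist(), n3.tolist())
```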
<p>The final sum of squared errors (following your calculation) is then:</p>
<pre><code>>>> squared_diff(x1, y) * n1 + squared_diff(x2, y) * n2 + squared_diff(x3, y) * n3
0 4766.4
1 2349.4
2 2354.4
3 6411.6
4 4496.2
dtype: float64
>>>
</code></pre>
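<p>Putting the steps together, one possible end-to-end version looks like the sketch below. The frame is hypothetical (its first row reuses the single-row values from the top of this answer, the second row is invented), and <code>TOTAL_TIMESLOT</code>, <code>errors</code>, <code>counts</code> and the <code>SSD</code> column are names chosen only for this illustration. It also assumes that goal times within a row do not overlap between home and away.</p>

```python
import numpy as np
import pandas as pd

TOTAL_TIMESLOT = 200

# Hypothetical frame; column names follow the question.
df = pd.DataFrame({
    "HomeG_Time": [[93, 109, 187], [10]],
    "AwayG_Time": [[90, 177], []],
    "ExpG_Home": [2.2, 1.5],
    "ExpG_Away": [1.8, 0.9],
})

home = df["ExpG_Home"].to_numpy()
away = df["ExpG_Away"].to_numpy()
y = np.column_stack([1 - (home + away), home, away])           # (N, 3)

# errors[i, j] = squared error of row i against the j-th one-hot vector
errors = ((np.eye(3)[None, :, :] - y[:, None, :]) ** 2).sum(axis=2)  # (N, 3)

# counts[i, j] = number of time slots in row i that use the j-th vector
n2 = df["HomeG_Time"].apply(len).to_numpy()
n3 = df["AwayG_Time"].apply(len).to_numpy()
counts = np.column_stack([TOTAL_TIMESLOT - (n2 + n3), n2, n3])       # (N, 3)

df["SSD"] = (errors * counts).sum(axis=1)
print(df["SSD"])
```

<p>Stacking the three errors and the three counts as (N, 3) arrays is just a compact way of writing the weighted sum above; it should give the same numbers as calling <code>squared_diff</code> three times.</p>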