<p>To fill in the values between <code>start</code> and <code>end</code>, you can adapt <a href="https://stackoverflow.com/questions/45118710/fill-in-values-between-given-indices-of-2d-numpy-array">this answer</a> as follows:</p>
<p>Data (11 rows, matching the output shown below):</p>
<p><code>df = pd.DataFrame([[0,0],[0,0],[0,0],[1,0],[0,0],[0,1],[0,0],[0,0],[1,0],[0,1],[0,0]], columns=['start','end'])</code></p>
<pre><code>    start  end
0       0    0
1       0    0
2       0    0
3       1    0
4       0    0
5       0    1
6       0    0
7       0    0
8       1    0
9       0    1
10      0    0
</code></pre>
<p>Get the row positions of the nonzero <code>start</code> and <code>end</code> entries:</p>
<pre><code>>>> s = df['start'].to_numpy().nonzero()[0]
>>> e = df['end'].to_numpy().nonzero()[0]
>>> s, e
(array([3, 8], dtype=int64), array([5, 9], dtype=int64))
</code></pre>
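<p>Note that <code>Series.nonzero()</code> was deprecated in pandas 0.24 and removed in 1.0, so on recent versions the lookup has to go through NumPy. A minimal sketch of the same index extraction using <code>np.flatnonzero</code>, assuming the 11-row frame shown above:</p>

```python
import numpy as np
import pandas as pd

# Same 11-row sample as in the answer.
df = pd.DataFrame(
    [[0, 0], [0, 0], [0, 0], [1, 0], [0, 0], [0, 1],
     [0, 0], [0, 0], [1, 0], [0, 1], [0, 0]],
    columns=["start", "end"],
)

# 0-based row positions of the nonzero entries in each column.
s = np.flatnonzero(df["start"].to_numpy())
e = np.flatnonzero(df["end"].to_numpy())

print(s.tolist(), e.tolist())  # [3, 8] [5, 9]
```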
<p>Reshape the original index into a column vector:</p>
<pre><code>>>> index = df.index.values.reshape(-1, 1)
>>> index
array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10]], dtype=int64)
</code></pre>
<p>Then we can take advantage of NumPy's <a href="https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html" rel="nofollow noreferrer">broadcasting</a>. Comparing the column vector with a list of <em>k</em> values produces one column per value:</p>
<pre><code>>>> index < [1]          >>> index < [1, 2, 3, 4, 5]
array([[ True],          array([[ True,  True,  True,  True,  True],
       [False],                 [False,  True,  True,  True,  True],
       [False],                 [False, False,  True,  True,  True],
       [False],                 [False, False, False,  True,  True],
       [False],                 [False, False, False, False,  True],
       [False],                 [False, False, False, False, False],
       [False],                 [False, False, False, False, False],
       [False],                 [False, False, False, False, False],
       [False],                 [False, False, False, False, False],
       [False],                 [False, False, False, False, False],
       [False]])                [False, False, False, False, False]])
</code></pre>
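<p>To make the shapes explicit: a column vector of shape <code>(n, 1)</code> compared against a 1-D array of shape <code>(k,)</code> broadcasts to an <code>(n, k)</code> boolean array. A small sketch with made-up values (the name <code>bounds</code> is illustrative only and does not appear in the answer):</p>

```python
import numpy as np

index = np.arange(4).reshape(-1, 1)  # shape (4, 1): one row per position
bounds = np.array([1, 3])            # shape (2,): one entry per threshold

# Each column j holds the comparison of every position against bounds[j].
mask = index < bounds

print(mask.shape)     # (4, 2)
print(mask.tolist())  # [[True, True], [False, True], [False, True], [False, False]]
```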
<p>Build one condition per <code>start</code>-<code>end</code> pair. Each column is <code>True</code> for the rows from one start through its matching end, inclusive:</p>
<pre><code>>>> (s <= index) & (index <= e)
array([[False, False],
       [False, False],
       [False, False],
       [ True, False],
       [ True, False],
       [ True, False],
       [False, False],
       [False, False],
       [False,  True],
       [False,  True],
       [False, False]])
</code></pre>
<p>Then <code>sum</code> along <code>axis=1</code> to collapse the per-pair columns into a single flag per row:</p>
<pre><code>>>> df['Expected Flag'] = ((s <= index) & (index <= e)).sum(axis=1)
>>> df
    start  end  Expected Flag
0       0    0              0
1       0    0              0
2       0    0              0
3       1    0              1
4       0    0              1
5       0    1              1
6       0    0              0
7       0    0              0
8       1    0              1
9       0    1              1
10      0    0              0
</code></pre>
<p>As a one-liner:
<code>((df['start'].to_numpy().nonzero()[0] <= df.index.values.reshape(-1, 1)) & (df.index.values.reshape(-1, 1) <= df['end'].to_numpy().nonzero()[0])).sum(axis=1)</code></p>
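<p>The steps above can be wrapped into a small helper. This is only a sketch under a few assumptions: the function name <code>flag_between</code> and its parameters are hypothetical, every start has exactly one matching end, and row positions from <code>np.arange</code> are used instead of <code>df.index.values</code> so that it also works when the frame does not have the default 0..n-1 index. It uses <code>any</code> rather than <code>sum</code>, which keeps the flag at 1 even if two intervals overlap:</p>

```python
import numpy as np
import pandas as pd


def flag_between(df, start_col="start", end_col="end"):
    """Return a 0/1 array marking rows inside any start..end interval (inclusive)."""
    s = np.flatnonzero(df[start_col].to_numpy())
    e = np.flatnonzero(df[end_col].to_numpy())
    if len(s) != len(e):
        raise ValueError("each start needs exactly one matching end")
    # Row positions as a column vector, so both comparisons broadcast
    # to shape (n_rows, n_pairs).
    pos = np.arange(len(df)).reshape(-1, 1)
    inside = (s <= pos) & (pos <= e)
    # any() yields 0/1 flags; sum() would count overlapping intervals.
    return inside.any(axis=1).astype(int)


df = pd.DataFrame(
    [[0, 0], [0, 0], [0, 0], [1, 0], [0, 0], [0, 1],
     [0, 0], [0, 0], [1, 0], [0, 1], [0, 0]],
    columns=["start", "end"],
)
df["Expected Flag"] = flag_between(df)
print(df["Expected Flag"].tolist())
# [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
```

<p>Keep in mind that the intermediate boolean array has one column per pair, so memory use grows with the number of rows times the number of intervals.</p>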