<p>I have a DataFrame that looks like this:</p>
<pre><code>  Category       Start         End
0        a  2014-12-01  2015-06-01
1        a  2015-10-02  2015-10-16
2        b  2015-10-01  2016-04-01
3        b  2015-10-01  2015-12-01
4        c  2015-06-01  2015-08-01
</code></pre>
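<p>For every date in the date range d, I want to find the rows where Start &lt;= date &lt;= End, and then count how many distinct categories those rows contain.</p>
<p>What is the most efficient way to do this? My current attempt loops over every date:</p>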
<pre><code>import pandas as pd
import datetime
d = pd.date_range(start='2015-01-01', end='2015-12-31', freq='D')
s = {'Start':[datetime.date(2014,12,1), datetime.date(2015,10,2), datetime.date(2015,10,1), datetime.date(2015,10,1), datetime.date(2015,6,1)]}
e = {'End':[datetime.date(2015,6,1), datetime.date(2015,10,16), datetime.date(2016,4,1), datetime.date(2015,12,1), datetime.date(2015,8,1)]}
c = {'Category': ['a', 'a', 'b', 'b', 'c']}
c.update(s)
c.update(e)
df = pd.DataFrame(c)
# Convert the datetime.date objects to datetime64 so they compare cleanly with the Timestamps in d
df['Start'] = pd.to_datetime(df['Start'])
df['End'] = pd.to_datetime(df['End'])
df_count = pd.DataFrame(index=d, columns=['count'])
for date in d:
    # Distinct categories among the rows whose [Start, End] interval contains this date
    count_occurrences = len(set(df.loc[(df['Start'] <= date) & (df['End'] >= date), 'Category']))
    # Some saving to keep track of the count for this particular date, e.g.
    df_count.loc[date, 'count'] = count_occurrences
</code></pre>
<p>The expected output would then be
df_count:</p>
<pre><code> Category Count
2015-01-01 1
2015-01-02 1
2015-01-03 1
2015-01-04 1
2015-01-05 1
.
.
.
2015-05-31 1
2015-06-01 2
2015-06-02 1
2015-06-03 1
.
.
.
2015-07-31 1
2015-08-01 1
2015-08-02 0
.
.
.
2015-09-30 0
2015-10-01 1
2015-10-02 2
2015-10-03 2
.
.
.
2015-10-15 2
2015-10-16 2
2015-10-17 1
.
.
.
2015-12-01 1
2015-12-02 1
.
.
.
2015-12-31 1
</code></pre>
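<p>One possible way to avoid the Python-level loop is sketched below. It is only a sketch under stated assumptions, not a definitive answer: it assumes Start and End are stored as datetime64 values, and the helper name count_distinct_categories is hypothetical. It uses NumPy broadcasting to build a (dates x rows) boolean mask, then merges rows that share a category so that each category is counted at most once per date.</p>

```python
import pandas as pd

# Same sample data as in the question, with Start/End parsed as datetime64
df = pd.DataFrame({
    "Category": ["a", "a", "b", "b", "c"],
    "Start": pd.to_datetime(["2014-12-01", "2015-10-02", "2015-10-01",
                             "2015-10-01", "2015-06-01"]),
    "End": pd.to_datetime(["2015-06-01", "2015-10-16", "2016-04-01",
                           "2015-12-01", "2015-08-01"]),
})
d = pd.date_range(start="2015-01-01", end="2015-12-31", freq="D")


def count_distinct_categories(df, dates):
    """Hypothetical helper: distinct categories active on each date."""
    day = dates.to_numpy()[:, None]            # shape (n_dates, 1)
    start = df["Start"].to_numpy()[None, :]    # shape (1, n_rows)
    end = df["End"].to_numpy()[None, :]        # shape (1, n_rows)

    # Broadcast to (n_dates, n_rows): True where Start <= date <= End
    active = (start <= day) & (day <= end)

    # Label each column with its category, then collapse duplicate labels:
    # a category is active on a date if any of its rows is active
    active = pd.DataFrame(active, index=dates, columns=df["Category"].to_numpy())
    per_category = active.T.groupby(level=0).any().T

    # Summing booleans across categories gives the distinct count per date
    return per_category.sum(axis=1).rename("count")


counts = count_distinct_categories(df, d)
print(counts.head())
```

<p>The mask holds one boolean per (date, row) pair, so memory grows with len(d) * len(df). That is fine for a year of days and a moderate number of rows, but for very large inputs it may be worth processing the dates in chunks.</p>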