<p>You can first build a <code>Series</code> with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html" rel="nofollow noreferrer"><code>date_range</code></a>, then swap the index and values of that <code>Series</code>, <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.join.html" rel="nofollow noreferrer"><code>join</code></a> it to the original DataFrame, and finally aggregate with <code>sum</code>:</p>
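<pre><code># One Series per row: index = every day in [date1, date2], value = the row's index label
s = pd.concat([pd.Series(r.Index, pd.date_range(r.date1, r.date2)) for r in df.itertuples()])
# Swap index and values so the days become values keyed by the original row label
s = pd.Series(s.index, index=s, name='usetdate')
# Attach the days to each row, then sum the scores per id and day
df = (df.drop(['date1', 'date2'], axis=1)
        .join(s)
        .groupby(['id', 'usetdate'], as_index=False)
        .sum())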
print (df)
   id   usetdate  score1  score2
0   1 2016-01-01       5       1
1   1 2016-01-02      15       4
2   1 2016-01-03      15       4
3   1 2016-01-04      10       6
4   1 2016-01-05      10       6
5   2 2016-01-01       3       7
6   2 2016-01-02       9       2
7   2 2016-01-03      17       9
8   2 2016-01-04      20      29
9   2 2016-01-05       3      20
</code></pre>
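<p>For readers who want to run the steps above end to end, here is a minimal sketch. The input frame is hypothetical (the question's data is not reproduced in this answer); only the column names <code>id</code>, <code>date1</code>, <code>date2</code>, <code>score1</code> and <code>score2</code> are taken from the code above.</p>

```python
import pandas as pd

# Hypothetical input: the real df from the question is not shown in this answer.
df = pd.DataFrame({
    'id': [1, 1, 2],
    'date1': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-01']),
    'date2': pd.to_datetime(['2016-01-02', '2016-01-03', '2016-01-01']),
    'score1': [5, 10, 3],
    'score2': [1, 3, 7],
})

# One Series per row: the index holds each day of the range, the value is the row label.
s = pd.concat([pd.Series(r.Index, pd.date_range(r.date1, r.date2))
               for r in df.itertuples()])
# Swap index and values so the Series can be joined on the row labels.
s = pd.Series(s.index, index=s, name='usetdate')

out = (df.drop(['date1', 'date2'], axis=1)
         .join(s)
         .groupby(['id', 'usetdate'], as_index=False)
         .sum())
print(out)
```

<p>Days covered by several ranges of the same <code>id</code> (here 2016-01-02 for <code>id</code> 1) end up with the scores of all overlapping rows added together.</p>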
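<p>Edit: an alternative that builds the expanded rows with a list comprehension instead of <code>concat</code> and <code>join</code>:</p>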
<pre><code>L = [(i, d, s1, s2) for i, d1, d2, s1, s2 in df.values for d in pd.date_range(d1, d2)]
df = (pd.DataFrame(L, columns=['id','usetdate','score1','score2'])
        .groupby(['id','usetdate'], as_index=False)
        .sum())
print (df)
   id   usetdate  score1  score2
0   1 2016-01-01       5       1
1   1 2016-01-02      15       4
2   1 2016-01-03      15       4
3   1 2016-01-04      10       6
4   1 2016-01-05      10       6
5   2 2016-01-01       3       7
6   2 2016-01-02       9       2
7   2 2016-01-03      17       9
8   2 2016-01-04      20      29
9   2 2016-01-05       3      20
</code></pre>
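<p>Edit:</p>
<p>Before aggregating, you can filter the rows with <code>merge</code>. Called without arguments, it performs an inner join on the shared columns <code>id</code> and <code>userdate</code>, so only the pairs listed in <code>df1</code> are kept:</p>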
<pre><code>df1['userdate'] = pd.to_datetime(df1['userdate'])
print (df1)
   id   userdate
0   1 2016-01-01
1   1 2016-01-03
2   2 2016-01-04
3   2 2016-01-02
L = [(i, d, s1, s2) for i, d1, d2, s1, s2 in df.values for d in pd.date_range(d1, d2)]
df = (pd.DataFrame(L, columns=['id','userdate','score1','score2'])
        .merge(df1)
        .groupby(['id','userdate'], as_index=False)
        .sum())
print (df)
   id   userdate  score1  score2
0   1 2016-01-01       5       1
1   1 2016-01-03      15       4
2   2 2016-01-02       9       2
3   2 2016-01-04      20      29
</code></pre>
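<p>To illustrate how the <code>merge</code> step filters, here is a small sketch with hypothetical frames; the names <code>expanded</code>, <code>wanted</code> and <code>filtered</code> are made up for this example. It relies on the pandas defaults <code>how='inner'</code> and joining on all shared columns.</p>

```python
import pandas as pd

# Hypothetical expanded rows (one per id and day), as produced by the list comprehension.
expanded = pd.DataFrame({
    'id': [1, 1, 1, 2],
    'userdate': pd.to_datetime(['2016-01-01', '2016-01-02',
                                '2016-01-02', '2016-01-01']),
    'score1': [5, 5, 10, 3],
    'score2': [1, 1, 3, 7],
})
# Hypothetical lookup of the (id, userdate) pairs to keep.
wanted = pd.DataFrame({
    'id': [1, 2],
    'userdate': pd.to_datetime(['2016-01-02', '2016-01-03']),
})

# merge() with no arguments joins on the shared columns (id, userdate) with how='inner',
# so a pair survives only if it appears in both frames.
filtered = (expanded.merge(wanted)
                    .groupby(['id', 'userdate'], as_index=False)
                    .sum())
print(filtered)
```

<p>The pair (2, 2016-01-03) exists only in <code>wanted</code>, so it does not appear in the result. Passing <code>how='right'</code> would keep it with missing scores instead.</p>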
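<p>Edit 1:</p>
<p>Alternatively, convert the rows of <code>df1</code> to a list of tuples and filter inside the list comprehension:</p>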
<pre><code>df1['userdate'] = pd.to_datetime(df1['userdate'])
print (df1)
   id   userdate
0   1 2016-01-01
1   1 2016-01-03
2   2 2016-01-04
3   2 2016-01-02
a = [tuple(x) for x in df1.values]
print (a)
[(1, Timestamp('2016-01-01 00:00:00')), (1, Timestamp('2016-01-03 00:00:00')),
(2, Timestamp('2016-01-04 00:00:00')), (2, Timestamp('2016-01-02 00:00:00'))]
L = [(i, d, s1, s2) for i, d1, d2, s1, s2 in df.values
                    for d in pd.date_range(d1, d2)
                    if (i, d) in a]
df = (pd.DataFrame(L, columns=['id','userdate','score1','score2'])
        .groupby(['id','userdate'], as_index=False)
        .sum())
print (df)
   id   userdate  score1  score2
0   1 2016-01-01       5       1
1   1 2016-01-03      15       4
2   2 2016-01-02       9       2
3   2 2016-01-04      20      29
</code></pre>
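<p>If <code>df1</code> is large, membership tests against a list can become slow. A possible variation, sketched here with hypothetical data, keeps the pairs in a <code>set</code>; the name <code>wanted</code> and the use of <code>itertuples</code> in place of <code>.values</code> are choices made for this example, not part of the code above.</p>

```python
import pandas as pd

# Hypothetical inputs with the same column layout as in the answer.
df = pd.DataFrame({
    'id': [1, 1, 2],
    'date1': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-01']),
    'date2': pd.to_datetime(['2016-01-02', '2016-01-03', '2016-01-02']),
    'score1': [5, 10, 3],
    'score2': [1, 3, 7],
})
df1 = pd.DataFrame({
    'id': [1, 2],
    'userdate': pd.to_datetime(['2016-01-02', '2016-01-02']),
})

# A set of (id, Timestamp) tuples gives constant-time lookups on average.
wanted = set(df1.itertuples(index=False, name=None))

# Expand each row to one tuple per day, keeping only the wanted (id, day) pairs.
L = [(i, d, s1, s2)
     for i, d1, d2, s1, s2 in df.itertuples(index=False, name=None)
     for d in pd.date_range(d1, d2)
     if (i, d) in wanted]

out = (pd.DataFrame(L, columns=['id', 'userdate', 'score1', 'score2'])
         .groupby(['id', 'userdate'], as_index=False)
         .sum())
print(out)
```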