<p>Here is another approach. It is not as elegant as @sacul's, but it runs at almost the same speed.</p>
<pre><code>import pandas as pd

x = pd.DataFrame({'user': ['a','a','b','b','a'],
                  'dt': ['2016-01-01','2016-01-02',
                         '2016-01-05','2016-01-06','2016-01-06'],
                  'val': [1,33,2,1,2]})
users = pd.unique(x.user)
x.dt = pd.to_datetime(x.dt)
# Full daily range between the earliest and latest dates
dates = pd.date_range(min(x.dt), max(x.dt))
x.set_index('dt', inplace=True)
# One column per user, aligned on the full date index (gaps become NaN)
df = pd.DataFrame(index=dates)
for u in users:
    df[u] = x[x.user==u].val
# Back to long format: one row per (user, date)
df = df.unstack().reset_index()
df.rename(columns={'level_0': 'user',
                   'level_1': 'dt',
                   0: 'val'}, inplace=True)
# Days with no record get 0
df['val'] = df['val'].fillna(0).astype(int)
df = df[['dt', 'user', 'val']]
</code></pre>
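<p>The resulting DataFrame has one row per user for every day between the earliest and latest dates, with <code>val</code> set to 0 on days that have no record.</p>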
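<p>For comparison, the same gap filling can probably be done without the per-user loop by reindexing onto every (user, date) pair. This is only a sketch, not the method used above, and it assumes each (user, dt) pair appears at most once in the input; the names <code>full_index</code> and <code>filled</code> are made up for illustration.</p>

```python
import pandas as pd

x = pd.DataFrame({'user': ['a', 'a', 'b', 'b', 'a'],
                  'dt': ['2016-01-01', '2016-01-02',
                         '2016-01-05', '2016-01-06', '2016-01-06'],
                  'val': [1, 33, 2, 1, 2]})
x['dt'] = pd.to_datetime(x['dt'])

# Every (user, date) combination over the full daily range.
dates = pd.date_range(x['dt'].min(), x['dt'].max())
full_index = pd.MultiIndex.from_product([x['user'].unique(), dates],
                                        names=['user', 'dt'])

# Reindex onto the full grid; pairs missing from the input get val = 0.
# Assumption: (user, dt) is unique, otherwise reindex raises an error.
filled = (x.set_index(['user', 'dt'])
           .reindex(full_index, fill_value=0)
           .reset_index()[['dt', 'user', 'val']])
```

<p>Passing <code>fill_value=0</code> keeps <code>val</code> as an integer column, so no separate <code>fillna</code> or <code>astype</code> step is needed.</p>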