<p>Use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.eq.html" rel="nofollow noreferrer"><code>eq</code></a> (<code>==</code>) to compare the column against the <code>string</code>, then aggregate with <code>sum</code> to count the <code>True</code> values, because <code>True</code>s are treated like <code>1</code>s:</p>
<pre><code>#convert to datetimes if necessary
inputdf['timestamp'] = pd.to_datetime(inputdf['timestamp'], format='%m/%d/%y')
print (inputdf)
   timestamp IceCreamOrder Location
0 2018-01-02     Chocolate    South
1 2018-01-03       Vanilla    North
2 2018-01-03    Strawberry    North
3 2018-01-03    Strawberry    North
4 2018-01-04    Strawberry    North
5 2018-01-04      Rasberry    North
6 2018-01-04       Vanilla    North
7 2018-01-05     Chocolate    North
mydf = (inputdf.set_index('timestamp')['IceCreamOrder']
               .eq('Strawberry')
               .groupby(pd.Grouper(freq='D'))
               .sum())
print (mydf)
timestamp
2018-01-02 0.0
2018-01-03 2.0
2018-01-04 1.0
2018-01-05 0.0
Freq: D, Name: IceCreamOrder, dtype: float64
</code></pre>
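<p>As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from the printed table (the literal data is an assumption reconstructed from that output; the names <code>is_strawberry</code>, <code>total</code> and <code>per_day</code> are illustrative only) and shows that summing a boolean mask counts its <code>True</code> values:</p>

```python
import pandas as pd

# Hypothetical sample data reconstructed from the table printed above.
inputdf = pd.DataFrame({
    'timestamp': ['01/02/18', '01/03/18', '01/03/18', '01/03/18',
                  '01/04/18', '01/04/18', '01/04/18', '01/05/18'],
    'IceCreamOrder': ['Chocolate', 'Vanilla', 'Strawberry', 'Strawberry',
                      'Strawberry', 'Rasberry', 'Vanilla', 'Chocolate'],
    'Location': ['South'] + ['North'] * 7,
})
inputdf['timestamp'] = pd.to_datetime(inputdf['timestamp'], format='%m/%d/%y')

# eq() returns a boolean Series; sum() adds True as 1 and False as 0.
is_strawberry = inputdf['IceCreamOrder'].eq('Strawberry')
total = int(is_strawberry.sum())

# The same mask, summed per calendar day.
per_day = (inputdf.set_index('timestamp')['IceCreamOrder']
                  .eq('Strawberry')
                  .groupby(pd.Grouper(freq='D'))
                  .sum())
print(total)
print(per_day)
```

<p>Depending on the pandas version, the per-day sums may be displayed as integers or as floats; the counts themselves are the same.</p>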
<p>To count all <code>type</code>s, add the <code>IceCreamOrder</code> column to the <code>groupby</code> and aggregate with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.size.html" rel="nofollow noreferrer"><code>size</code></a>:</p>
<hr/>
<pre><code>mydf1 = (inputdf.set_index('timestamp')
                .groupby([pd.Grouper(freq='D'), 'IceCreamOrder'])
                .size()
                .unstack(fill_value=0))
print (mydf1)
IceCreamOrder  Chocolate  Rasberry  Strawberry  Vanilla
timestamp
2018-01-02             1         0           0        0
2018-01-03             0         0           2        1
2018-01-04             0         1           1        1
2018-01-05             1         0           0        0
</code></pre>
<p>If none of the <code>datetime</code>s have <code>time</code> components, you can group by the <code>timestamp</code> column directly:</p>
<pre><code>mydf1 = (inputdf.groupby(['timestamp', 'IceCreamOrder'])
                .size()
                .unstack(fill_value=0))
print (mydf1)
IceCreamOrder  Chocolate  Rasberry  Strawberry  Vanilla
timestamp
2018-01-02             1         0           0        0
2018-01-03             0         0           2        1
2018-01-04             0         1           1        1
2018-01-05             1         0           0        0
</code></pre>
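<p>To illustrate why the presence of times matters, here is a minimal sketch using a hypothetical <code>orders</code> frame (not part of the original data) whose timestamps include a time of day. Grouping on the raw column makes every distinct timestamp its own row, while <code>pd.Grouper(freq='D')</code> bins the rows into calendar days first:</p>

```python
import pandas as pd

# Hypothetical orders whose timestamps include a time of day.
orders = pd.DataFrame({
    'timestamp': pd.to_datetime(['2018-01-03 09:15', '2018-01-03 17:40',
                                 '2018-01-04 11:05']),
    'IceCreamOrder': ['Strawberry', 'Strawberry', 'Vanilla'],
})

# Grouping on the raw column: one row per distinct timestamp.
raw = (orders.groupby(['timestamp', 'IceCreamOrder'])
             .size()
             .unstack(fill_value=0))

# Grouping with a daily Grouper: one row per calendar day.
daily = (orders.set_index('timestamp')
               .groupby([pd.Grouper(freq='D'), 'IceCreamOrder'])
               .size()
               .unstack(fill_value=0))
print(raw)
print(daily)
```

<p>With this sample, <code>raw</code> has three rows and <code>daily</code> has two, so the plain <code>groupby</code> is only equivalent when every timestamp is already a bare date.</p>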