<p>First filter the rows where column <code>successpizza</code> is <code>True</code>, then <code>sum</code> the column <code>houroftheday</code>:</p>
<pre><code>sum_hour = data.loc[data['successpizza'] == 'true', 'houroftheday'].sum()
print(sum_hour)
102
</code></pre>
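<p>The filter-then-sum step can be sketched on a small frame. The sample values below are hypothetical (the real <code>data</code> is not shown in the question), and the sketch assumes <code>successpizza</code> holds the strings <code>'true'</code>/<code>'false'</code> rather than real booleans:</p>

```python
import pandas as pd

# Hypothetical sample data; the real `data` frame is not shown in the answer.
data = pd.DataFrame({
    "successpizza": ["true", "false", "true", "false"],
    "houroftheday": [1, 2, 14, 3],
})

# Boolean mask: True where the request succeeded (values stored as strings here).
mask = data["successpizza"] == "true"

# Select houroftheday for the matching rows only, then add the values up.
sum_hour = data.loc[mask, "houroftheday"].sum()
print(sum_hour)  # 15
```

<p>If the column already had a boolean dtype, the comparison would probably be unnecessary and <code>data.loc[data['successpizza'], 'houroftheday']</code> should select the same rows.</p>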
<p>If you need the <code>size</code> instead, count the <code>True</code> values; <code>sum</code> works because each <code>True</code> is treated as <code>1</code>:</p>
<pre><code>len_hour = (data['successpizza'] == 'true').sum()
print(len_hour)
8
</code></pre>
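<p>As a quick check of why summing a boolean mask counts rows, the sketch below compares it with the length of the filtered frame. The sample values are hypothetical:</p>

```python
import pandas as pd

# Hypothetical sample data with three successful requests.
data = pd.DataFrame({"successpizza": ["true", "false", "true", "true"]})

mask = data["successpizza"] == "true"

# Each True contributes 1 and each False contributes 0 to the sum.
len_hour = mask.sum()

# Same count, obtained by filtering first and measuring the result.
n_rows = len(data[mask])

print(len_hour, n_rows)  # 3 3
```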
<p>Or, if you need the count for each <code>houroftheday</code>:</p>
<pre><code>mask = (data['successpizza'] == 'true').astype(int)
out = mask.groupby(data['houroftheday']).sum()
print(out)
houroftheday
1     1
2     2
3     0
12    0
14    1
18    1
20    1
21    0
22    1
23    1
Name: successpizza, dtype: int32
</code></pre>
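<p>One reason to group the mask rather than filter first is that hours with no successful request still appear with a count of <code>0</code>. The sketch below contrasts the two approaches on hypothetical data; the <code>value_counts</code> variant is shown only for comparison and is not part of the answer above:</p>

```python
import pandas as pd

# Hypothetical sample data: hour 3 has no successful request.
data = pd.DataFrame({
    "successpizza": ["true", "true", "false", "true", "false"],
    "houroftheday": [1, 2, 2, 2, 3],
})

# Approach from the answer: turn the mask into 0/1 and sum it per hour.
mask = (data["successpizza"] == "true").astype(int)
out = mask.groupby(data["houroftheday"]).sum()
print(out.to_dict())  # {1: 1, 2: 2, 3: 0}

# Comparison: filtering first drops hours that have no successes at all.
filtered = (
    data.loc[data["successpizza"] == "true", "houroftheday"]
    .value_counts()
    .sort_index()
)
print(filtered.to_dict())  # {1: 1, 2: 2}
```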
<p>To remove the trailing whitespace, use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.strip.html" rel="nofollow noreferrer"><code>str.strip</code></a>:</p>
<pre><code>line = "requester_received_pizza"
lines = pizzarequests[pizzarequests.str.contains(line)].str.split(",").str[1].str.strip()
</code></pre>
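<p>To see what <code>str.strip</code> changes, the sketch below runs the same chain on a small Series. It assumes <code>pizzarequests</code> is a pandas Series of comma-separated strings; the sample rows are hypothetical:</p>

```python
import pandas as pd

# Hypothetical sample rows; the real `pizzarequests` is assumed to be a
# Series of "key,value" strings with stray whitespace around the value.
pizzarequests = pd.Series([
    "requester_received_pizza, true ",
    "request_title, Need pizza",
    "requester_received_pizza,false  ",
])

line = "requester_received_pizza"

# Keep only the rows that mention the key of interest.
matching = pizzarequests[pizzarequests.str.contains(line)]

# Take the part after the first comma; whitespace is still attached.
raw = matching.str.split(",").str[1]
print(raw.tolist())    # [' true ', 'false  ']

# str.strip removes both leading and trailing whitespace.
lines = raw.str.strip()
print(lines.tolist())  # ['true', 'false']
```

<p>If only the trailing side should be trimmed, <code>str.rstrip</code> is the narrower alternative.</p>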