<h3><code>drop_duplicates</code> with <code>groupby</code> + <code>count</code></h3>
<pre><code>(df.drop_duplicates()
   .groupby('Site_Where_Served')
   .Site_Where_Served.count()
   .reset_index(name='Site_Visit_Count')
)

  Site_Where_Served  Site_Visit_Count
0          hospital                 3
1         inpatient                 1
</code></pre>
<p>Note that one subtle difference between <code>count</code> and <code>size</code> is that the former does not count NaN entries.</p>
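<p>To illustrate that difference, here is a minimal sketch. The frame and its column names (<code>site</code>, <code>visit</code>) are hypothetical and are not taken from the question's data; the point is only that a NaN value is skipped by <code>count</code> but still counted by <code>size</code>.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical data (not from the question): one 'visit' value is missing.
df = pd.DataFrame({
    'site': ['hospital', 'hospital', 'inpatient'],
    'visit': ['a', np.nan, 'b'],
})

grouped = df.groupby('site')['visit']
counts = grouped.count()  # non-null entries per group
sizes = grouped.size()    # all rows per group, NaN included

print(counts)
print(sizes)
```

<p>If the counted column can contain missing values and those rows should still count as visits, <code>size</code> is probably the safer choice.</p>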
<hr/>
<h3>Tuplize, <code>groupby</code>, and <code>nunique</code></h3>
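<p>This is really just a fix for your current solution, but I don't recommend it, because it is rather long-winded and takes more steps than necessary. First, tuplize your columns, group by <code>Site_Where_Served</code>, and then count the unique tuples:</p>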
<pre><code>(df[['Person', 'Service_Date']]
   .apply(tuple, 1)
   .groupby(df.Site_Where_Served)
   .nunique()
   .reset_index(name='Site_Visit_Count')
)

  Site_Where_Served  Site_Visit_Count
0          hospital                 3
1         inpatient                 1
</code></pre>
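<p>For anyone who wants to try both approaches end to end, the following is a self-contained sketch. The sample rows are made up for illustration (the question's actual data is not reproduced here), and the frame is assumed to hold only the three columns <code>Person</code>, <code>Service_Date</code> and <code>Site_Where_Served</code>. Under that assumption both approaches should give the same counts.</p>

```python
import pandas as pd

# Hypothetical sample data; the second row is an exact duplicate of the first.
df = pd.DataFrame({
    'Person': ['A', 'A', 'A', 'B', 'B'],
    'Service_Date': ['2020-01-01', '2020-01-01', '2020-01-02',
                     '2020-01-01', '2020-01-03'],
    'Site_Where_Served': ['hospital', 'hospital', 'hospital',
                          'hospital', 'inpatient'],
})

# Approach 1: drop fully duplicated rows, then count rows per site.
by_dedup = (df.drop_duplicates()
              .groupby('Site_Where_Served')['Site_Where_Served']
              .count()
              .reset_index(name='Site_Visit_Count'))

# Approach 2: build (Person, Service_Date) tuples, then count unique tuples per site.
by_tuple = (df[['Person', 'Service_Date']]
              .apply(tuple, axis=1)
              .groupby(df['Site_Where_Served'])
              .nunique()
              .reset_index(name='Site_Visit_Count'))

print(by_dedup)
print(by_tuple)
```

<p>Note that <code>drop_duplicates()</code> with no arguments compares every column, so if the frame has extra columns that differ between otherwise identical visits, the two approaches may disagree; passing <code>subset=</code> would be one way to control that.</p>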