擅长:python、mysql、java
<p>根据你想要什么,我会使用:</p>
<pre><code>import pandas as pd
threshold=7
cities = ['city1' for _ in range(10)] + ['city2' for _ in range(5)]
df = pd.DataFrame(cities, columns=['city'])
df['freq'] = df.groupby('city')['city'].transform('count')
df = df[df['freq']>threshold]
</code></pre>
<p>它保留了原始df中的所有行</p>
<pre><code>df = pd.DataFrame(df['city'].value_counts())
df = df[df['city']<threshold]
</code></pre>
<p>每个城市的名字只能给你一行</p>