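<p>Pick the column that serves as the indicator, which is TRNDESCR in your example, and use the time granularity you want as the filter, here the month. Then drop duplicate (MONTH, TRNDESCR) pairs, group by TRNDESCR, and count the number of distinct months in which each transaction type occurs. Finally, keep only the rows whose TRNDESCR occurs in at least three months.</p>
<p>Example:</p>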
<pre class="lang-py prettyprint-override"><code>import pandas as pd

# Sample data
df = pd.DataFrame()
df['TIME'] = ["2018-12-19", "2018-12-20", "2019-01-20", "2019-02-06",
              "2018-12-18", "2018-12-02", "2019-01-03", "2019-02-06"]
df['TRNDESCR'] = ["ib1", "ib2", "ib2", "ib2",
                  "ib2", "ib3", "ib3", "ib3"]
df['ACNO'] = 85

# Extract the month number (note: dt.month ignores the year)
df['TIME'] = pd.to_datetime(df['TIME'])
df['MONTH'] = df['TIME'].dt.month

# Keep one row per (MONTH, TRNDESCR) pair, then count distinct months per TRNDESCR
count_month = (df[['MONTH', 'TRNDESCR']]
               .drop_duplicates(['MONTH', 'TRNDESCR'], keep="last")
               .groupby('TRNDESCR')['MONTH']
               .count())

# Keep rows whose TRNDESCR occurs in at least 3 different months
df[df['TRNDESCR'].isin(count_month[count_month >= 3].index)]
</code></pre>
<pre><code>        TIME TRNDESCR  ACNO  MONTH
1 2018-12-20      ib2    85     12
2 2019-01-20      ib2    85      1
3 2019-02-06      ib2    85      2
4 2018-12-18      ib2    85     12
5 2018-12-02      ib3    85     12
6 2019-01-03      ib3    85      1
7 2019-02-06      ib3    85      2
</code></pre>
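<p>If your data can span more than one year, the month number alone would treat December 2018 and December 2019 as the same month. A possible variant, offered only as a sketch on the same sample data: build a year-month period and count distinct periods per TRNDESCR with <code>nunique</code>. The column name <code>YEAR_MONTH</code> and the variable names are my own choices, not something from your question.</p>

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "TIME": pd.to_datetime(["2018-12-19", "2018-12-20", "2019-01-20", "2019-02-06",
                            "2018-12-18", "2018-12-02", "2019-01-03", "2019-02-06"]),
    "TRNDESCR": ["ib1", "ib2", "ib2", "ib2",
                 "ib2", "ib3", "ib3", "ib3"],
})
df["ACNO"] = 85

# Year-month period, so the same month in different years stays distinct
# (YEAR_MONTH is an illustrative column name)
df["YEAR_MONTH"] = df["TIME"].dt.to_period("M")

# Number of distinct year-months per TRNDESCR, broadcast back onto every row
months_per_descr = df.groupby("TRNDESCR")["YEAR_MONTH"].transform("nunique")

# Keep rows whose TRNDESCR occurs in at least 3 distinct year-months
result = df[months_per_descr >= 3]
print(result)
```

<p>Using <code>transform</code> avoids the separate <code>drop_duplicates</code> and <code>isin</code> steps, because the per-group count is already aligned with the original rows.</p>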