选择不同月份的同一笔交易

2024-10-02 14:22:27 发布

您现在位置:Python中文网/ 问答频道 /正文

我想选择或保持相同的事务描述(TRNDESCR),使用Pandas dataframe至少只发生3个月。我尝试了一些代码,但没有按预期工作。你知道吗

以下是示例数据集:

    ACNO TIME                   TRNCD   TRNDESCR                                        TRNAMT
0   85   2018-12-19 20:40:00    109     Ib Transfer To Phoutthalom Syh Account No:123   -20000
1   85   2018-12-19 21:15:00    109     Ib Transfer To Phoutthalom Syh Account No:123   -25000
2   85   2018-12-20 15:30:00    109     Ib Transfer To Thongsavath Pra Account No:124   -10000
3   85   2018-12-22 12:30:00    209     Bil Payment                                     -500
4   85   2018-12-25 15:34:00    109     Ib Transfer To Phoutthalom Syh Account No:123   -60000
5   85   2019-01-22 12:30:00    209     Bil Payment                                     -501
6   85   2019-01-23 12:50:00    109     Ib Transfer To Sarah Account No:199             -3000
7   85   2019-01-31 08:59:00    109     Ib Transfer To Thongsavath Pra Account No:124   -650000
8   85   2019-02-02 12:30:00    109     Ib Transfer To Sarah Account No:199             -600
9   85   2019-02-03 15:02:00    109     Ib Transfer To Phoutthalom Syh Account No:123   -60000
10  85   2019-02-04 15:21:00    109     Ib Transfer To Thongsavath Pra Account No:124   -863000
11  85   2019-02-05 15:30:00    209     Bil Payment                                     -600

预期结果如下:

    ACNO TIME                   TRNCD   TRNDESCR                                        TRNAMT
0   85   2018-12-20 15:30:00    109     Ib Transfer To Thongsavath Pra Account No:124   -10000
1   85   2018-12-22 12:30:00    209     Bil Payment                                     -500
2   85   2019-01-22 12:30:00    209     Bil Payment                                     -501
3   85   2019-01-31 08:59:00    109     Ib Transfer To Thongsavath Pra Account No:124   -650000
4   85   2019-02-04 15:21:00    109     Ib Transfer To Thongsavath Pra Account No:124   -863000
5   85   2019-02-05 15:30:00    209     Bil Payment                                     -600

Tags: tonotimeaccountpaymenttransferibbil
2条回答

这是我的解决办法


import pandas as pd

df = pd.read_excel("df_85.xlsx")

df_copy = df.copy()

# introduce new column
time = pd.DatetimeIndex(df_copy.TIME)
df_copy['yearmonth'] = time.year.astype(str) + time.month.astype(str)

# find month occurences within each TRNDESCR group
new_df = df_copy.groupby(['TRNDESCR']).yearmonth.nunique().to_frame().reset_index()
new_df = new_df[new_df.yearmonth >= 3]

# get row with TRNDESCR matches those in new_df
output_df = df[df.TRNDESCR.isin(new_df.TRNDESCR.values)]

print(output_df)

输出

    ACNO  YEAR  MONTH                TIME  TRNCD                                       TRNDESCR  TRNAMT
2     85  2018     12 2018-12-20 15:30:00    109  Ib Transfer To Thongsavath Pra Account No:124  -10000
3     85  2018     12 2018-12-22 12:30:00    209                                   Bil Payment     -500
5     85  2018      1 2019-01-22 12:30:00    209                                   Bil Payment     -501
7     85  2019      1 2019-01-31 08:59:00    109  Ib Transfer To Thongsavath Pra Account No:124 -650000
10    85  2019      2 2019-02-04 15:21:00    109  Ib Transfer To Thongsavath Pra Account No:124 -863000
11    85  2019      2 2019-02-05 15:30:00    209                                   Bil Payment     -600

它的工作原理是创建一个新列“yearmonth”(它是年+月的串联)。然后它执行groupby来trndescrc并计算每个组的唯一yearmonth数。你知道吗

选择被认为是一个指标的列,就像你给出的例子一样,它是TRNDESCR,也可以是你想要的时间,把“月”作为过滤器。然后删除duplicate并按TRNDESCR分组,然后根据月份计算事务发生的次数。你知道吗

示例:

import pandas as pd

df = pd.DataFrame()
df['TIME'] = ["2018-12-19", "2018-12-20", "2019-01-20", "2019-02-06",
             "2018-12-18", "2018-12-02", "2019-01-03", "2019-02-06"]
df['TRNDESCR'] = ["ib1", "ib2", "ib2", "ib2",
                 "ib2", "ib3", "ib3", "ib3"]
df['ACNO'] = 85


df['TIME'] = pd.to_datetime(df['TIME'])
df['MONTH'] = df['TIME'].dt.month

count_month = df[['MONTH', 'TRNDESCR']].drop_duplicates(['MONTH', 'TRNDESCR'], keep="last").groupby('TRNDESCR')['MONTH'].count()

df[df['TRNDESCR'].isin(count_month[count_month >= 3].index)]

TIME    TRNDESCR    ACNO    MONTH
1   2018-12-20  ib2     85  12
2   2019-01-20  ib2     85  1
3   2019-02-06  ib2     85  2
4   2018-12-18  ib2     85  12
5   2018-12-02  ib3     85  12
6   2019-01-03  ib3     85  1
7   2019-02-06  ib3     85  2

相关问题 更多 >