如何在多索引数据帧中添加和删除单索引数据帧行?

2024-09-28 16:59:57 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个df,Stock_to_buy,它告诉我们在给定的日期买哪只股票。索引列是Date

             Symbol   Shares
Date         
2018-01-01   AAOI     20
2018-01-03   FB       34
2018-01-05   AMZN     5
2018-01-07   SQ       25
2018-01-08   TPL      31

我创建了另一个df,Portfolio。我猜解决方案要求它是多索引的,所以索引列是DateSymbol

                       Shares
Date         Symbol
2018-01-01   NaN       NaN
2018-01-02   NaN       NaN
2018-01-03   NaN       NaN
2018-01-04   NaN       NaN
2018-01-05   NaN       NaN
2018-01-06   NaN       NaN
2018-01-07   NaN       NaN
2018-01-08   NaN       NaN

我想在Stock_to_buy的基础上更新Portfolio:如果一个符号在某个给定的日期在Stock_to_buy中,它应该作为一行添加到该日期的Portfolio,并且在随后的日期它应该保持在那里,直到Sell_Next_Day变成True,在这种情况下Symbol的行应该从随后的日期中删除。下面是一个示例,我希望Portfolio如何工作

                      Shares   Sell_Next_Day
Date         Symbol
2017-12-31   NaN      NaN      NaN 
---------------------------------------------        
2018-01-01   AAOI     20       False
---------------------------------------------
2018-01-02   AAOI     20       False
---------------------------------------------
2018-01-03   AAOI     20       False
             FB       34       False
---------------------------------------------
2018-01-04   AAOI     20       False
             FB       34       False
---------------------------------------------
2018-01-05   AAOI     20       False
             FB       34       False
             AMZN     5        False
---------------------------------------------
2018-01-06   AAOI     20       True           # AAOI will be sold next day
             FB       34       False
             AMZN     5        False
---------------------------------------------
2018-01-07   FB       34       False
             AMZN     5        True           # AMZN will be sold next day
             SQ       25       False
---------------------------------------------
2018-01-08   FB       34       False
             SQ       25       False
             TPL      31       False

我该怎么做?感谢您的帮助


Tags: tofalsetruedfdatefbstocksq
1条回答
网友
1楼 · 发布于 2024-09-28 16:59:57

我不确定它是否正是你想要的,但我认为它会有所帮助。 数据:

Stock_to_buy = pd.DataFrame(data=[
                                  ['2018-01-01','AAOI','20'],
                                  ['2018-01-03', 'FB', 34],
                                  ['2018-01-05', 'AMZN' , 5],
                                  ['2018-01-07', 'SQ', 25],
                                  ['2018-01-08', 'TPL' ,31]
                            ], columns=['Date', 'Symbol','Shares']
                           )
Stock_to_buy.Date = pd.to_datetime(Stock_to_buy.Date)

步骤1。填充所有可能的值

END_DATE = Stock_to_buy.sort_values('Date').Date.values[-1]
def fill(df):
    index = pd.date_range(start=df.Date.values[0], end=END_DATE)
    return pd.DataFrame(data=[[1]*len(index)], index=index)

Portfolio = Stock_to_buy.groupby(['Symbol', 'Shares']
                    ).apply(fill).reset_index(
                    ).drop([i for i in range(0,8)], 1).rename(
                    columns={'level_2': 'Date'}).set_index('Date')

输出:

           Symbol Shares
Date                    
2018-01-01   AAOI     20
2018-01-02   AAOI     20
2018-01-03   AAOI     20
2018-01-04   AAOI     20
2018-01-05   AAOI     20
2018-01-06   AAOI     20
2018-01-07   AAOI     20
2018-01-08   AAOI     20
2018-01-05   AMZN      5
2018-01-06   AMZN      5
2018-01-07   AMZN      5
2018-01-08   AMZN      5
2018-01-03     FB     34
2018-01-04     FB     34
2018-01-05     FB     34
2018-01-06     FB     34
2018-01-07     FB     34
2018-01-08     FB     34
2018-01-07     SQ     25
2018-01-08     SQ     25
2018-01-08    TPL     31

第二步。如果我们在第二天有专栏文章,比如:

Sell_Next_Day = pd.Series([False, False, False, False, False, True, False, 
                           False, False, False, True, False, False, False, False,
                           False,False,False, False, False, False],
                           name='Sell_Next_Day')

然后我们可以加入它们并在SymbolSell_Next_Day中有真值之后删除所有下一个raw

def drop_dates(df):
    if df.Sell_Next_Day.any():
        res = df[df.Date <= df[df.Sell_Next_Day == True].Date.values[0]]
    else:
        res = df
    return res

Portfolio = Portfolio.reset_index().join(Sell_Next_Day
                            ).groupby('Symbol'
                            ).apply(drop_dates).set_index('Date')

输出:

           Symbol Shares  Sell_Next_Day
Date                                   
2018-01-01   AAOI     20          False
2018-01-02   AAOI     20          False
2018-01-03   AAOI     20          False
2018-01-04   AAOI     20          False
2018-01-05   AAOI     20          False
2018-01-06   AAOI     20           True
2018-01-05   AMZN      5          False
2018-01-06   AMZN      5          False
2018-01-07   AMZN      5           True
2018-01-03     FB     34          False
2018-01-04     FB     34          False
2018-01-05     FB     34          False
2018-01-06     FB     34          False
2018-01-07     FB     34          False
2018-01-08     FB     34          False
2018-01-07     SQ     25          False
2018-01-08     SQ     25          False
2018-01-08    TPL     31          False

这是你想要的吗

相关问题 更多 >