Posted 2024-09-28 20:43:46
User
I have a MultiIndex DataFrame indexed by id and month.
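For each id (index level 1), I want to slice month (index level 2) up to and including the last non-zero value in either the amount1 or the amount2 column.
Expected output
I tried slicing all ids at once, but I don't know how to cut a different chunk for each id: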
df.loc[:,:max(df[df['amount1'] != 0].index)[1]]
There may be more efficient options, but the following code does what you want:
import pandas as pd

# We create the original dataframe
arrays = [
    [102, 102, 102, 102, 102, 102, 102, 102,
     103, 103, 103, 103, 103, 103, 103,
     104, 104, 104, 104, 104, 104, 104, 104, 104, 104],
    ["11/1/2004", "12/1/2004", "1/1/2005", "2/1/2005", "3/1/2005", "4/1/2005", "5/1/2005", "6/1/2005",
     "4/1/2003", "5/1/2003", "6/1/2003", "7/1/2003", "8/1/2003", "9/1/2003", "10/1/2003",
     "8/1/2003", "9/1/2003", "10/1/2003", "11/1/2003", "12/1/2003",
     "1/1/2004", "2/1/2004", "3/1/2004", "4/1/2004", "5/1/2004"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['id', 'month'])
amount1 = [0, 0, -9100000, 0, 1444.1, 0, 0, 0,
           0, 0, 0, -5.4e7, 0, 0, 0,
           0, 0, 0, 0, -3.3e7, -4.3e7, 0, 0, 0, 0]
amount2 = [1105.900001, 0, 1037.3, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0,
           117.4199962, 117.315, 0, 0, 107.77771641, 105.9499986, 0, 106.3398808, 0, 0]
df = pd.DataFrame({"amount1": amount1, "amount2": amount2}, index=index)

# We slice the dataframe by ids
df_out_list = []
for id_ in df.index.levels[0]:
    # Rows of this id only, indexed by month
    df2 = df.xs(id_)
    # Rows where at least one amount is non-zero
    # (this assumes every id has at least one such row)
    df2_nonzeros = df2[(df2['amount1'] != 0) | (df2['amount2'] != 0)]
    # Keep everything up to and including the last non-zero row
    df2_result = df2.loc[:df2_nonzeros.index[-1]]
    # Rebuild the (id, month) MultiIndex for the slice
    N = len(df2_result.index)
    tuples_result = list(zip([id_] * N, df2_result.index))
    index_result = pd.MultiIndex.from_tuples(tuples_result, names=['id', 'month'])
    df_out_list.append(pd.DataFrame({"amount1": list(df2_result["amount1"]),
                                     "amount2": list(df2_result["amount2"])},
                                    index=index_result))

# We create the output dataframe by concatenating the per-id dataframes
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)
df_out = pd.concat(df_out_list)

print(df)
print(df_out)
The output is:
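As for the "more efficient options" mentioned above, one possible loop-free variant is sketched below. It is not part of the original answer: it assumes the same column names (`amount1`, `amount2`) and index level name (`id`), and it uses a small made-up frame rather than the question's data. The idea is to reverse the rows, take a per-id running count of non-zero rows, and keep only rows where that count is above zero.

```python
import pandas as pd

# Small illustrative frame (hypothetical values, same layout as the question's data)
index = pd.MultiIndex.from_tuples(
    [(1, "1/1/2005"), (1, "2/1/2005"), (1, "3/1/2005"),
     (2, "1/1/2005"), (2, "2/1/2005"), (2, "3/1/2005"), (2, "4/1/2005")],
    names=["id", "month"],
)
df = pd.DataFrame(
    {"amount1": [0, 5.0, 0, 0, 0, -2.0, 0],
     "amount2": [1.5, 0, 0, 0, 3.0, 0, 0]},
    index=index,
)

# 1 where at least one amount column is non-zero, else 0
nonzero = (df[["amount1", "amount2"]] != 0).any(axis=1).astype(int)

# Walk each id's rows from the end and count the non-zero rows seen so far;
# rows that come after the last non-zero row of their id get a count of 0.
remaining = nonzero[::-1].groupby(level="id", sort=False).cumsum()[::-1]

# Keep only rows at or before the last non-zero row of their id
df_trimmed = df[remaining > 0]
print(df_trimmed)
```

Unlike the loop version, an id whose amounts are all zero simply drops out of the result instead of raising an error; whether that is the desired behaviour depends on the use case.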