我有一个csv文件,我已经用
df = pd.read_csv("af.csv")
CSV文件如下所示(预览):
"match_id","start_time","win","leaguename","opposing_team","team","min"
2992096687,1486840800,True,"CaptainsDraft",3729377,2642171,1453382256
2992217489,1486845476,true,"Captains Draft",3729377,2642171,1453382256
2994454005,1486926905,false,"Captains Draft",2586976,2642171,1453382256
2659805546,1474478411,false,"BTSSeries",55,2642171,1454281287
2659879628,1474481141,false,"BTSSeries",55,2642171,1454281287
2661783205,1474563571,false,"BTSSeries",2537636,2642171,1454281287
2661875544,1474566865,false,"BTSSeries",2537636,2642171,1454281287
2662027296,1474573160,true,"BTSSeries",59,2642171,1454281287
2758086417,1478352060,true,"ESLManila16",2163,2642171,1454692269
2758241073,1478355547,true,"ESLManila16",2163,2642171,1454692269
2747710178,1477941012,false,"ESLFrankfurt16",2850016,2642171,1459782261
2747808587,1477945318,true,"ESLFrankfurt16",2850016,2642171,1459782261
2747861268,1477947994,true,"ESLFrankfurt16",2850016,2642171,1459782261
现在我要做的是保持联赛的第一场比赛,然后是该联赛所有比赛的赢数(真是赢,假是输),然后按开始时间排序
我有以下代码来执行此操作:
df1 = df.groupby(['leaguename', 'team']).sum().reset_index()
df1 = df1[['win','leaguename','team']]
df2 = df.sort_values("start_time").groupby("leaguename", as_index=False).first()
df2 = df2[['leaguename', 'start_time']]
output = pd.merge(df1, df2, 'inner', on = 'leaguename')
输出返回混乱无序的开始时间:
,win,leaguename,team,start_time
0,5.0,ASUSROGSeason6,2642171,1478022101
1,6.0,CaptainsDraft,2642171,1486840800
2,3.0,Dota2Asia17,2642171,1486130597
3,2.0,DotaPitSeason5,2642171,1476903919
4,5.0,ESLFrankfurt16,2642171,1477941012
5,2.0,ESLManila16,2642171,1478352060
6,6.0,GlobalGrandMasters,2642171,1466176095
7,4.0,NanyangChampionshipsSeason2,2642171,1464178206
期望输出:
,win,leaguename,team,start_time
0,4.0,NanyangChampionshipsSeason2,2642171,1464178206
1,6.0,GlobalGrandMasters,2642171,1466176095
2,2.0,DotaPitSeason5,2642171,1476903919
3,5.0,ESLFrankfurt16,2642171,1477941012
4,5.0,ASUSROGSeason6,2642171,1478022101
5,2.0,ESLManila16,2642171,1478352060
6,3.0,Dota2Asia17,2642171,1486130597
7,6.0,CaptainsDraft,2642171,1486840800
我怎样才能达到预期的产出?你知道吗
我认为您需要^{} 按列} 和参数
start_time
使用^{drop=True
作为默认的唯一单调索引:另一种解决方案是将
sort=False
添加到两个groupby
:相关问题 更多 >
编程相关推荐