Merging DFs with different shapes and column names, without duplicates


I have two DFs that share a two-level index (team name, 32 teams; years 2015-2018). DF1 has 9 columns of each team's annual NFL stat averages. DF2 has the same 32 teams and years (2015-2018) in its index, but 11 columns of per-game ("week") stats for up to 17 games. I'm trying to merge them so that, for each team and year, the annual averages come first (one row, DF1's 9 columns), followed by that team/year's 17 game ("week") rows with DF2's 11 columns.

import pandas as pd

# Build a (Tm_name, Year) MultiIndex for the season-averages frame
lvl0 = result.Tm_name.values
lvl1 = result.Year.values
newidx = pd.MultiIndex.from_arrays([lvl0, lvl1], names=["Tm_name", "Year"])
result.set_index(newidx, inplace=True)
result.drop(["Year", "Tm_name"], axis=1, inplace=True)
print(result)


                W     L   W_L_Pct   PD     MoV  SoS   SRS  OSRS  DSRS
Tm_name Year                                                         
1       2015  13.0   3.0   0.813   176.0  11.0  1.3  12.3   9.0   3.4
        2016   7.0   8.0   0.469    56.0   3.5 -1.9   1.6   2.4  -0.8
        2017   8.0   8.0   0.500   -66.0  -4.1  0.4  -3.7  -4.0   0.2
        2018   3.0  13.0   0.188  -200.0 -12.5  1.0 -11.5  -9.6  -1.9
2       2015   8.0   8.0   0.500    -6.0  -0.4 -3.4  -3.8  -4.0   0.3
        2016  11.0   5.0   0.688   134.0   8.4  0.1   8.5  10.5  -2.0
        2017  10.0   6.0   0.625    38.0   2.4  1.9   4.3   1.1   3.2
        2018   7.0   9.0   0.438    -9.0  -0.6  0.4  -0.1   2.5  -2.6
3       2015   5.0  11.0   0.313   -73.0  -4.6  2.6  -1.9  -0.7  -1.2
        2016   8.0   8.0   0.500    22.0   1.4  0.2   1.5  -1.1   2.6
        2017   9.0   7.0   0.563    92.0   5.8 -2.4   3.4   2.2   1.2
        2018  10.0   6.0   0.625   102.0   6.4  0.6   7.0   0.6   6.4
4       2015   8.0   8.0   0.500    20.0   1.3 -1.2   0.0   0.3  -0.2
        2016   7.0   9.0   0.438    21.0   1.3 -1.6  -0.3   1.8  -2.2
        2017   9.0   7.0   0.563   -57.0  -3.6 -0.5  -4.0  -3.0  -1.0
        2018   6.0  10.0   0.375  -105.0  -6.6 -0.3  -6.9  -6.3  -0.6
5       2015  15.0   1.0   0.938   192.0  12.0 -3.9   8.1   6.0   2.1
        2016   6.0  10.0   0.375   -33.0  -2.1  1.1  -1.0  -0.2  -0.8
        2017  11.0   5.0   0.688    36.0   2.3  2.1   4.3   1.7   2.7
        2018   7.0   9.0   0.438    -6.0  -0.4  1.3   0.9   0.1   0.8
6       2015   6.0  10.0   0.375   -62.0  -3.9  2.6  -1.3  -0.1  -1.2
        2016   3.0  13.0   0.188  -120.0  -7.5  0.0  -7.5  -5.2  -2.3
        2017   5.0  11.0   0.313   -56.0  -3.5  2.2  -1.3  -4.6   3.3
        2018  12.0   4.0   0.750   138.0   8.6 -2.3   6.3   1.5   4.8
7       2015  12.0   4.0   0.750   140.0   8.8  1.9  10.6   4.8   5.8
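This is the same MultiIndex that a single set_index call would also build (a minimal sketch, assuming result still has its Tm_name and Year columns at that point):

# Equivalent one-liner: set_index drops the columns and builds the MultiIndex
result = result.set_index(["Tm_name", "Year"]).sort_index()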


# Same MultiIndex for the per-game frame
lvl_0 = result2.Tm_name.values
lvl_1 = result2.Year.values
newidx_2 = pd.MultiIndex.from_arrays([lvl_0, lvl_1], names=["Tm_name", "Year"])
result2.set_index(newidx_2, inplace=True)
result2.drop(["Year", "Tm_name"], axis=1, inplace=True)
print(result2)

             Week    Date     win_loss  home_away  Opp1_team  Tm_Pnts  \
Tm_name Year                                                            
1       2018   1  2018-09-09     0.0       1.0       32.0       6.0     
        2018   2  2018-09-16     0.0       0.0       18.0       0.0     
        2018   3  2018-09-23     0.0       1.0        6.0      14.0     
        2018   4  2018-09-30     0.0       1.0       28.0      17.0     
        2018   5  2018-10-07     1.0       0.0       29.0      28.0     
        2018   6  2018-10-14     0.0       0.0       20.0      17.0     
        2018   7  2018-10-18     0.0       1.0       10.0      10.0     
        2018   8  2018-10-28     1.0       1.0       29.0      18.0     
        2018  10  2018-11-11     0.0       0.0       16.0      14.0     
        2018  11  2018-11-18     0.0       1.0       25.0      21.0     
        2018  12  2018-11-25     0.0       0.0       17.0      10.0     
        2018  13  2018-12-02     1.0       0.0       12.0      20.0     
        2018  14  2018-12-09     0.0       1.0       11.0       3.0     
        2018  15  2018-12-16     0.0       0.0        2.0      14.0     
        2018  16  2018-12-23     0.0       1.0       18.0       9.0     
        2018  17  2018-12-30     0.0       0.0       28.0      24.0     
        2017   1  2017-09-10     0.0       0.0       11.0      23.0     
        2017   2  2017-09-17     1.0       0.0       14.0      16.0     
        2017   3  2017-09-25     0.0       1.0        9.0      17.0     
        2017   4  2017-10-01     1.0       1.0       29.0      18.0     
        2017   5  2017-10-08     0.0       0.0       26.0       7.0     
        2017   6  2017-10-15     1.0       1.0       30.0      38.0     
        2017   7  2017-10-22     0.0       0.0       18.0       0.0     
        2017   9  2017-11-05     1.0       0.0       29.0      20.0     
        2017  10  2017-11-09     0.0       1.0       28.0      16.0     
        2017  11  2017-11-19     0.0       0.0       13.0      21.0     
        2017  12  2017-11-26     1.0       1.0       15.0      27.0     
        2017  13  2017-12-03     0.0       1.0       18.0      16.0     
        2017  14  2017-12-10     1.0       1.0       31.0      12.0     
        2017  15  2017-12-17     0.0       0.0       32.0      15.0     
...           ...        ...       ...        ...        ...      ...   
        2016   5  2016-10-06     1.0       0.0       29.0      33.0     
        2016   6  2016-10-17     1.0       1.0       24.0      28.0     
        2016   7  2016-10-23     NaN       1.0       28.0       6.0     
        2016   8  2016-10-30     0.0       0.0        5.0      20.0     
        2016  10  2016-11-13     1.0       1.0       29.0      23.0     
        2016  11  2016-11-20     0.0       0.0       20.0      24.0     
        2016  12  2016-11-27     0.0       0.0        2.0      19.0     
        2016  13  2016-12-04     1.0       1.0       32.0      31.0     
        2016  14  2016-12-11     0.0       0.0       19.0      23.0     
        2016  15  2016-12-18     0.0       1.0       22.0      41.0     
        2016  16  2016-12-24     1.0       0.0       28.0      34.0     
        2016  17  2016-01-01     1.0       0.0       18.0      44.0     
        2015   1  2015-09-13     1.0       1.0       22.0      31.0     
        2015   2  2015-09-20     1.0       0.0        6.0      48.0     
        2015   3  2015-09-27     1.0       1.0       29.0      47.0     
        2015   4  2015-10-04     0.0       1.0        NaN      22.0     
        2015   5  2015-10-11     1.0       0.0       11.0      42.0     
        2015   6  2015-10-18     0.0       0.0       27.0      13.0     
        2015   7  2015-10-26     1.0       1.0        3.0      26.0     
        2015   8  2015-11-01     1.0       0.0        8.0      34.0     
        2015  10  2015-11-15     1.0       0.0       28.0      39.0     
        2015  11  2015-11-22     1.0       1.0        7.0      34.0     
        2015  12  2015-11-29     1.0       0.0       29.0      19.0     
        2015  13  2015-12-06     1.0       0.0        NaN      27.0     
        2015  14  2015-12-10     1.0       1.0       20.0      23.0     
        2015  15  2015-12-20     1.0       0.0       26.0      40.0     
        2015  16  2015-12-27     1.0       1.0       12.0      38.0     
        2015  17  2015-01-03     0.0       1.0       28.0       6.0     
2       2018   1  2018-09-06     0.0       0.0       26.0      12.0     
        2018   2  2018-09-16     1.0       1.0        5.0      31.0     

              Opp2_pnts  Off_1stD  Off_TotYd  Def_1stD_All  Def_TotYd_All  
Tm_name Year                                                               
1       2018    24.0       14.0      213.0        30.0          429.0      
        2018    34.0        5.0      137.0        24.0          432.0      
        2018    16.0       13.0      221.0        21.0          316.0      
        2018    20.0       18.0      263.0        19.0          331.0      
        2018    18.0       10.0      220.0        33.0          447.0      
        2018    27.0       16.0      268.0        20.0          411.0      
        2018    45.0       14.0      223.0        15.0          309.0      
        2018    15.0       20.0      321.0        16.0          267.0      
        2018    26.0       21.0      260.0        20.0          330.0      
        2018    23.0       13.0      282.0        19.0          325.0      
        2018    45.0       10.0      149.0        30.0          414.0      
        2018    17.0       18.0      315.0        22.0          325.0      
        2018    17.0       22.0      279.0        16.0          218.0      
        2018    40.0       18.0      253.0        23.0          435.0      
        2018    31.0       15.0      263.0        33.0          461.0      
        2018    27.0       12.0      198.0        16.0          291.0      
        2017    35.0       24.0      308.0        19.0          367.0      
        2017    13.0       17.0      389.0        18.0          266.0      
        2017    28.0       22.0      332.0        15.0          273.0      
        2017    15.0       25.0      368.0        20.0          305.0      
        2017    34.0       16.0      307.0        19.0          419.0      
        2017    33.0       23.0      432.0        21.0          412.0      
        2017    33.0       10.0      196.0        28.0          425.0      
        2017    10.0       20.0      368.0        17.0          329.0      
        2017    22.0       24.0      290.0        14.0          287.0      
        2017    31.0       17.0      292.0        22.0          357.0      
        2017    24.0       20.0      344.0        19.0          219.0      
        2017    32.0       19.0      305.0        18.0          303.0      
        2017     7.0       16.0      261.0        14.0          204.0      
        2017    20.0       19.0      286.0        14.0          218.0      
...                 ...       ...        ...           ...            ...  
        2016    21.0       17.0      288.0        25.0          286.0      
        2016     3.0       28.0      396.0        11.0          230.0      
        2016     6.0       23.0      443.0        11.0          257.0      
        2016    30.0       22.0      340.0        19.0          349.0      
        2016    20.0       26.0      443.0        15.0          281.0      
        2016    30.0       24.0      290.0        16.0          217.0      
        2016    38.0       23.0      332.0        28.0          360.0      
        2016    23.0       24.0      369.0        19.0          333.0      
        2016    26.0       21.0      300.0        15.0          314.0      
        2016    48.0       26.0      425.0        33.0          488.0      
        2016    31.0       21.0      370.0        24.0          391.0      
        2016     6.0       21.0      344.0         9.0          123.0      
        2015    19.0       25.0      427.0        18.0          408.0      
        2015    23.0       21.0      300.0        18.0          335.0      
        2015     7.0       28.0      446.0        10.0          156.0      
        2015    24.0       26.0      447.0        13.0          328.0      
        2015    17.0       15.0      345.0        29.0          435.0      
        2015    25.0       21.0      469.0        14.0          310.0      
        2015    18.0       21.0      414.0        18.0          276.0      
        2015    20.0       25.0      491.0        16.0          254.0      
        2015    32.0       30.0      451.0        18.0          343.0      
        2015    31.0       21.0      383.0        24.0          377.0      
        2015    13.0       26.0      337.0        17.0          368.0      
        2015     3.0       29.0      524.0         9.0          212.0      
        2015    20.0       22.0      393.0        23.0          389.0      
        2015    17.0       28.0      493.0        19.0          424.0      
        2015     8.0       19.0      381.0        16.0          178.0      
        2015    36.0       16.0      232.0        22.0          354.0      
2       2018    18.0       16.0      299.0        18.0          232.0      
        2018    24.0       23.0      442.0        27.0          439.0

I can't just merge them. I've tried many different approaches, but the result always comes out wrong, so I figured I'd append everything to a list and build a DataFrame from that array. The array looks like what I want, but when I put it into a DataFrame it doesn't work (see below):

app = []

# Interleave rows: for each (team, year) averages row, append that
# (team, year)'s game rows right after it
for row in result.itertuples():
    app.append(row)
    for row_1 in result2.itertuples():
        if row[0] == row_1[0]:      # same (Tm_name, Year) index
            app.append(row_1)

Pandas(Index=(1, 2015), Week='1', Date=Timestamp('2015-09-13 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=22.0, Tm_Pnts=31.0, Opp2_pnts=19.0, Off_1stD=25.0, Off_TotYd=427.0, Def_1stD_All=18.0, Def_TotYd_All=408.0)
Pandas(Index=(1, 2015), Week='2', Date=Timestamp('2015-09-20 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=6.0, Tm_Pnts=48.0, Opp2_pnts=23.0, Off_1stD=21.0, Off_TotYd=300.0, Def_1stD_All=18.0, Def_TotYd_All=335.0)
Pandas(Index=(1, 2015), Week='3', Date=Timestamp('2015-09-27 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=29.0, Tm_Pnts=47.0, Opp2_pnts=7.0, Off_1stD=28.0, Off_TotYd=446.0, Def_1stD_All=10.0, Def_TotYd_All=156.0)
Pandas(Index=(1, 2015), Week='4', Date=Timestamp('2015-10-04 00:00:00'), win_loss=0.0, home_away=1.0, Opp1_team=nan, Tm_Pnts=22.0, Opp2_pnts=24.0, Off_1stD=26.0, Off_TotYd=447.0, Def_1stD_All=13.0, Def_TotYd_All=328.0)
Pandas(Index=(1, 2015), Week='5', Date=Timestamp('2015-10-11 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=11.0, Tm_Pnts=42.0, Opp2_pnts=17.0, Off_1stD=15.0, Off_TotYd=345.0, Def_1stD_All=29.0, Def_TotYd_All=435.0)
Pandas(Index=(1, 2015), Week='6', Date=Timestamp('2015-10-18 00:00:00'), win_loss=0.0, home_away=0.0, Opp1_team=27.0, Tm_Pnts=13.0, Opp2_pnts=25.0, Off_1stD=21.0, Off_TotYd=469.0, Def_1stD_All=14.0, Def_TotYd_All=310.0)
Pandas(Index=(1, 2015), Week='7', Date=Timestamp('2015-10-26 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=3.0, Tm_Pnts=26.0, Opp2_pnts=18.0, Off_1stD=21.0, Off_TotYd=414.0, Def_1stD_All=18.0, Def_TotYd_All=276.0)
Pandas(Index=(1, 2015), Week='8', Date=Timestamp('2015-11-01 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=8.0, Tm_Pnts=34.0, Opp2_pnts=20.0, Off_1stD=25.0, Off_TotYd=491.0, Def_1stD_All=16.0, Def_TotYd_All=254.0)
Pandas(Index=(1, 2015), Week='10', Date=Timestamp('2015-11-15 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=28.0, Tm_Pnts=39.0, Opp2_pnts=32.0, Off_1stD=30.0, Off_TotYd=451.0, Def_1stD_All=18.0, Def_TotYd_All=343.0)
Pandas(Index=(1, 2015), Week='11', Date=Timestamp('2015-11-22 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=7.0, Tm_Pnts=34.0, Opp2_pnts=31.0, Off_1stD=21.0, Off_TotYd=383.0, Def_1stD_All=24.0, Def_TotYd_All=377.0)
Pandas(Index=(1, 2015), Week='12', Date=Timestamp('2015-11-29 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=29.0, Tm_Pnts=19.0, Opp2_pnts=13.0, Off_1stD=26.0, Off_TotYd=337.0, Def_1stD_All=17.0, Def_TotYd_All=368.0)
Pandas(Index=(1, 2015), Week='13', Date=Timestamp('2015-12-06 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=nan, Tm_Pnts=27.0, Opp2_pnts=3.0, Off_1stD=29.0, Off_TotYd=524.0, Def_1stD_All=9.0, Def_TotYd_All=212.0)
Pandas(Index=(1, 2015), Week='14', Date=Timestamp('2015-12-10 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=20.0, Tm_Pnts=23.0, Opp2_pnts=20.0, Off_1stD=22.0, Off_TotYd=393.0, Def_1stD_All=23.0, Def_TotYd_All=389.0)
Pandas(Index=(1, 2015), Week='15', Date=Timestamp('2015-12-20 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=26.0, Tm_Pnts=40.0, Opp2_pnts=17.0, Off_1stD=28.0, Off_TotYd=493.0, Def_1stD_All=19.0, Def_TotYd_All=424.0)
Pandas(Index=(1, 2015), Week='16', Date=Timestamp('2015-12-27 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=12.0, Tm_Pnts=38.0, Opp2_pnts=8.0, Off_1stD=19.0, Off_TotYd=381.0, Def_1stD_All=16.0, Def_TotYd_All=178.0)
Pandas(Index=(1, 2015), Week='17', Date=Timestamp('2015-01-03 00:00:00'), win_loss=0.0, home_away=1.0, Opp1_team=28.0, Tm_Pnts=6.0, Opp2_pnts=36.0, Off_1stD=16.0, Off_TotYd=232.0, Def_1stD_All=22.0, Def_TotYd_All=354.0)
Pandas(Index=(1, 2016), W=7.0, L=8.0, W_L_Pct=0.469, PD=56.0, MoV=3.5, SoS=-1.9, SRS=1.6, OSRS=2.4, DSRS=-0.8)
Pandas(Index=(1, 2016), Week=1, Date=Timestamp('2016-09-11 00:00:00'), win_loss=0.0, home_away=1.0, Opp1_team=21.0, Tm_Pnts=21.0, Opp2_pnts=23.0, Off_1stD=21.0, Off_TotYd=344.0, Def_1stD_All=19.0, Def_TotYd_All=363.0)
Pandas(Index=(1, 2016), Week=2, Date=Timestamp('2016-09-18 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=30.0, Tm_Pnts=40.0, Opp2_pnts=7.0, Off_1stD=20.0, Off_TotYd=416.0, Def_1stD_All=21.0, Def_TotYd_All=306.0)
Pandas(Index=(1, 2016), Week=3, Date=Timestamp('2016-09-25 00:00:00'), win_loss=0.0, home_away=0.0, Opp1_team=4.0, Tm_Pnts=18.0, Opp2_pnts=33.0, Off_1stD=25.0, Off_TotYd=348.0, Def_1stD_All=16.0, Def_TotYd_All=297.0)
Pandas(Index=(1, 2016), Week=4, Date=Timestamp('2016-10-02 00:00:00'), win_loss=0.0, home_away=1.0, Opp1_team=18.0, Tm_Pnts=13.0, Opp2_pnts=17.0, Off_1stD=26.0, Off_TotYd=420.0, Def_1stD_All=12.0, Def_TotYd_All=288.0)
Pandas(Index=(1, 2016), Week=5, Date=Timestamp('2016-10-06 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=29.0, Tm_Pnts=33.0, Opp2_pnts=21.0, Off_1stD=17.0, Off_TotYd=288.0, Def_1stD_All=25.0, Def_TotYd_All=286.0)
Pandas(Index=(1, 2016), Week=6, Date=Timestamp('2016-10-17 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=24.0, Tm_Pnts=28.0, Opp2_pnts=3.0, Off_1stD=28.0, Off_TotYd=396.0, Def_1stD_All=11.0, Def_TotYd_All=230.0)
Pandas(Index=(1, 2016), Week=7, Date=Timestamp('2016-10-23 00:00:00'), win_loss=nan, home_away=1.0, Opp1_team=28.0, Tm_Pnts=6.0, Opp2_pnts=6.0, Off_1stD=23.0, Off_TotYd=443.0, Def_1stD_All=11.0, Def_TotYd_All=257.0)
Pandas(Index=(1, 2016), Week=8, Date=Timestamp('2016-10-30 00:00:00'), win_loss=0.0, home_away=0.0, Opp1_team=5.0, Tm_Pnts=20.0, Opp2_pnts=30.0, Off_1stD=22.0, Off_TotYd=340.0, Def_1stD_All=19.0, Def_TotYd_All=349.0)
Pandas(Index=(1, 2016), Week=10, Date=Timestamp('2016-11-13 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=29.0, Tm_Pnts=23.0, Opp2_pnts=20.0, Off_1stD=26.0, Off_TotYd=443.0, Def_1stD_All=15.0, Def_TotYd_All=281.0)
Pandas(Index=(1, 2016), Week=11, Date=Timestamp('2016-11-20 00:00:00'), win_loss=0.0, home_away=0.0, Opp1_team=20.0, Tm_Pnts=24.0, Opp2_pnts=30.0, Off_1stD=24.0, Off_TotYd=290.0, Def_1stD_All=16.0, Def_TotYd_All=217.0)
Pandas(Index=(1, 2016), Week=12, Date=Timestamp('2016-11-27 00:00:00'), win_loss=0.0, home_away=0.0, Opp1_team=2.0, Tm_Pnts=19.0, Opp2_pnts=38.0, Off_1stD=23.0, Off_TotYd=332.0, Def_1stD_All=28.0, Def_TotYd_All=360.0)
Pandas(Index=(1, 2016), Week=13, Date=Timestamp('2016-12-04 00:00:00'), win_loss=1.0, home_away=1.0, Opp1_team=32.0, Tm_Pnts=31.0, Opp2_pnts=23.0, Off_1stD=24.0, Off_TotYd=369.0, Def_1stD_All=19.0, Def_TotYd_All=333.0)
Pandas(Index=(1, 2016), Week=14, Date=Timestamp('2016-12-11 00:00:00'), win_loss=0.0, home_away=0.0, Opp1_team=19.0, Tm_Pnts=23.0, Opp2_pnts=26.0, Off_1stD=21.0, Off_TotYd=300.0, Def_1stD_All=15.0, Def_TotYd_All=314.0)
Pandas(Index=(1, 2016), Week=15, Date=Timestamp('2016-12-18 00:00:00'), win_loss=0.0, home_away=1.0, Opp1_team=22.0, Tm_Pnts=41.0, Opp2_pnts=48.0, Off_1stD=26.0, Off_TotYd=425.0, Def_1stD_All=33.0, Def_TotYd_All=488.0)
Pandas(Index=(1, 2016), Week=16, Date=Timestamp('2016-12-24 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=28.0, Tm_Pnts=34.0, Opp2_pnts=31.0, Off_1stD=21.0, Off_TotYd=370.0, Def_1stD_All=24.0, Def_TotYd_All=391.0)
Pandas(Index=(1, 2016), Week=17, Date=Timestamp('2016-01-01 00:00:00'), win_loss=1.0, home_away=0.0, Opp1_team=18.0, Tm_Pnts=44.0, Opp2_pnts=6.0, Off_1stD=21.0, Off_TotYd=344.0, Def_1stD_All=9.0, Def_TotYd_All=123.0)....

This is exactly how I want each year laid out: one row of that year's averages first, then the 17 rows of per-game data for that year. But now, when I try to put it into a DataFrame:

new = pd.DataFrame(app, index=["Tm_name", "Year"])
AssertionError: 10 cols passed, passed data had 12 cols
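My guess is the error comes from app mixing two kinds of namedtuples: the season-average rows carry 10 fields and the game rows carry 12, so pandas can't stack them as equal-width rows. A minimal sketch of one workaround (not verified on the full data): turn each tuple into a dict so columns align by field name and the missing fields become NaN.

# Sketch: align by field name instead of position; missing fields become NaN
records = [row._asdict() for row in app]          # rows from itertuples support _asdict()
new = pd.DataFrame(records)
new.index = pd.MultiIndex.from_tuples(new.pop("Index"), names=["Tm_name", "Year"])
print(new)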

Can anyone help? I've been playing with this for two weeks, trying MultiIndexes and various merges/joins, and I just can't get the result to look like the app array without duplicates.
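In case it helps make the goal concrete, here is a rough sketch of the interleaving I'm after, built by stacking the two frames and doing one stable sort (column names taken from the printouts above; I haven't verified this end to end):

# Rough sketch: tag each frame so the averages row sorts before that
# (team, year)'s game rows, then do one stable sort on team/year/order.
avg = result.assign(_order=0)         # 9 columns of season averages
games = result2.assign(_order=1)      # 11 columns of per-game stats

combined = (
    pd.concat([avg, games], sort=False)   # union of both column sets, NaN elsewhere
      .reset_index()
      .sort_values(["Tm_name", "Year", "_order"], kind="mergesort")  # stable sort
      .set_index(["Tm_name", "Year"])
      .drop(columns="_order")
)
print(combined.loc[(1, 2015)])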

Thanks again.

