当数据帧位于同一单元格上时,如何按条件用另一个数据帧填充数据帧并求和?

2024-10-03 21:32:23 发布

您现在位置:Python中文网/ 问答频道 /正文

有两个数据帧df1和df2,如下所示

df1说明了每个列名称的数量数据。因此,每列将包含从楼层到下方的数量

    Column name    Floor    Under  ConcType   ConcTotal     RebarTotal     SteelTotal
0   CB1            ACP      GF     C80        21.6          7954.8         2265.0
1   CB1A           B4F      LGF    C60        1.2           318.8          0.0
...   
201 CT12B          GF       8F     C60        4.5           757.6         840.0
202 CT13A          GF       8F     C80        1.4           429.3         420.0

df2是单独楼层填充数量数据的一种形式。但df2在Fl上不同。名称_h(列)平均值将包含承载该楼层的所有列形式df1的数量

H承载该楼层的立柱的平均高度。按标高楼层-上一楼层标高计算

例如,F8层的H为48400-42800=5600

    Fl.Name_Elev    Elevation    Fl.Name_h   H         C60_Conc  C80_Conc   Rebar   Steel
0   8F              48400        8F          5600.0    NaN       NaN        NaN     NaN 
1   7F              42800        7F          6000.0    NaN       NaN        NaN     NaN
2   6F              36800        6F          6000.0    NaN       NaN        NaN     NaN
3   5F              30800        5F          5600.0    NaN       NaN        NaN     NaN
4   4F              25200        4F          5600.0    NaN       NaN        NaN     NaN
5   3F              19600        3F          5600.0    NaN       NaN        NaN     NaN
6   2F              14000        2F          6000.0    NaN       NaN        NaN     NaN
7   1F              8000         1F          6000.0    NaN       NaN        NaN     NaN
8   GF              2000         GF          3300.0    NaN       NaN        NaN     NaN
9   LGF MZ          -1300        LGF MZ      3150.0    NaN       NaN        NaN     NaN 
10  LGF             -4450        LGF         4650.0    NaN       NaN        NaN     NaN
11  B1F             -9100        B1F         3000.0    NaN       NaN        NaN     NaN
12  B2F             -12100       B2F         3000.0    NaN       NaN        NaN     NaN
13  B3F             -15100       B3F         4650.0    NaN       NaN        NaN     NaN 
14  B4F             -19750       B4F         8250.0    NaN       NaN        NaN     NaN
15  ACP             -28000       NaN         NaN       NaN       NaN        NaN     NaN

我想从df1填充df2中“C60_Conc”、“C80_Conc”、“钢筋”、“钢”的NaN

结果如下所示:

Fl.Name_Elev    Elevation    Fl.Name_h       H         C60_Conc C80_Conc    Rebar    Steel
0   8F              48400        8F          5600.0    4.5      1.4         1186.9   1260   
1   7F              42800        7F          6000.0    4.5      1.4         1186.9   1260
2   6F              36800        6F          6000.0    4.5      1.4         1186.9   1260
3   5F              30800        5F          5600.0    4.5      1.4         1186.9   1260
4   4F              25200        4F          5600.0    4.5      1.4         1186.9   1260
5   3F              19600        3F          5600.0    4.5      1.4         1186.9   1260
6   2F              14000        2F          6000.0    4.5      1.4         1186.9   1260
7   1F              8000         1F          6000.0    4.5      1.4         1186.9   1260
8   GF              2000         GF          3300.0    0        21.6        7954.8   2265
9   LGF MZ          -1300        LGF MZ      3150.0    0        21.6        7954.8   2265   
10  LGF             -4450        LGF         4650.0    1.2      21.6        8276.6   2265
11  B1F             -9100        B1F         3000.0    1.2      21.6        8276.6   2265
12  B2F             -12100       B2F         3000.0    1.2      21.6        8276.6   2265
13  B3F             -15100       B3F         4650.0    1.2      21.6        8276.6   2265   
14  B4F             -19750       B4F         8250.0    0        21.6        7954.8   2265
15  ACP             -28000       NaN         NaN       NaN      NaN         NaN      NaN

另外,很抱歉我的沟通。英语不是我的母语


Tags: name数量nandf1df2gfmzfl
1条回答
网友
1楼 · 发布于 2024-10-03 21:32:23

解决方案:

floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]), 
                   columns="ConcType", 
                   values="ConcTotal").reset_index().groupby("Floors").sum()
output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})
>>> output
   Fl.Name_Elev  Elevation Fl.Name_h  ...  SteelTotal  C60_Conc  C80_Conc
0            8F      48400        8F  ...        1260       4.5       1.4
1            7F      42800        7F  ...        1260       4.5       1.4
2            6F      36800        6F  ...        1260       4.5       1.4
3            5F      30800        5F  ...        1260       4.5       1.4
4            4F      25200        4F  ...        1260       4.5       1.4
5            3F      19600        3F  ...        1260       4.5       1.4
6            2F      14000        2F  ...        1260       4.5       1.4
7            1F       8000        1F  ...        1260       4.5       1.4
8            GF       2000        GF  ...        3525       4.5      23.0
9        LGF MZ      -1300    LGF MZ  ...        2265       0.0      21.6
10          LGF      -4450       LGF  ...        2265       1.2      21.6
11          B1F      -9100       B1F  ...        2265       1.2      21.6
12          B2F     -12100       B2F  ...        2265       1.2      21.6
13          B3F     -15100       B3F  ...        2265       1.2      21.6
14          B4F     -19750       B4F  ...        2265       1.2      21.6
15          ACP     -28000       NaN  ...        2265       0.0      21.6

解释

首先,我们使用pandas.explode将df1中从地板到下的所有地板包括在内:

floors = df2["Fl.Name_Elev"].tolist()[::-1]
df1["Floors"] = df1.apply(lambda x: floors[floors.index(x["Floor"]):floors.index(x["Under"])+1], axis=1)
df1 = df1.explode(column=["Floors"])
>>> df1.head()
  Column name Floor Under ConcType  ConcTotal  RebarTotal  SteelTotal  Floors
0         CB1   ACP    GF      C80       21.6      7954.8        2265     ACP
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B4F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B3F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B2F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     B1F
0         CB1   ACP    GF      C80       21.6      7954.8        2265     LGF
0         CB1   ACP    GF      C80       21.6      7954.8        2265  LGF MZ
0         CB1   ACP    GF      C80       21.6      7954.8        2265      GF
1        CB1A   B4F   LGF      C60        1.2       318.8           0     B4F
1        CB1A   B4F   LGF      C60        1.2       318.8           0     B3F

然后,我们通过使用pandas.pivot来创建一个映射器,以获得C60和C80的单独列,并使用pandas.groupbysum来获得每个楼层的总数:

mapper = df1.pivot(index=df1.columns.difference(["ConcType", "ConcTotal"]),
                   columns="ConcType",
                   values="ConcTotal") \
            .reset_index() \
            .groupby("Floor").sum()
>>> mapper
ConcType  RebarTotal  SteelTotal  C60   C80
Floors                                     
1F            1186.9        1260  4.5   1.4
2F            1186.9        1260  4.5   1.4
3F            1186.9        1260  4.5   1.4
4F            1186.9        1260  4.5   1.4
5F            1186.9        1260  4.5   1.4
6F            1186.9        1260  4.5   1.4
7F            1186.9        1260  4.5   1.4
8F            1186.9        1260  4.5   1.4
ACP           7954.8        2265  0.0  21.6
B1F           8273.6        2265  1.2  21.6
B2F           8273.6        2265  1.2  21.6
B3F           8273.6        2265  1.2  21.6
B4F           8273.6        2265  1.2  21.6
GF            9141.7        3525  4.5  23.0
LGF           8273.6        2265  1.2  21.6
LGF MZ        7954.8        2265  0.0  21.6

最后,我们在df2和mapper上使用pandas.merge来获得输出:

output = df2.merge(mapper, left_on="Fl.Name_Elev", right_index=True, how="left")
output = output.rename(columns={"C60": "C60_Conc", "C80": "C80_Conc"})

相关问题 更多 >