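Filling one column's missing values with the mean grouped by another column

3 answers

User

#1 · Edited 2024-10-02 04:23:29

By "corresponding level" I assume you mean rows with equal Section values.

If so, you can do this with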

for section_value in sorted(set(df.Section)):

    df.loc[df['Section']==section_value, 'Price'] = df.loc[df['Section']==section_value, 'Price'].fillna(df.loc[df['Section']==section_value, 'Price'].mean())
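For anyone who wants to run the loop above, here is a minimal self-contained sketch. The sample frame is made up for illustration (only the column names `Section` and `Price` come from the question), and `mask` and `section_mean` are names introduced here so the filter is computed once per section:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Section": [1, 1, 2, 1, 2, 2],
    "Price": [2.0, np.nan, 5.0, 6.0, 10.0, np.nan],
})

for section_value in sorted(set(df["Section"])):
    # Boolean mask selecting the rows of this section
    mask = df["Section"] == section_value
    # mean() skips NaN by default, so this is the mean of the known prices
    section_mean = df.loc[mask, "Price"].mean()
    # Fill only the missing prices within this section
    df.loc[mask, "Price"] = df.loc[mask, "Price"].fillna(section_mean)

print(df)
```

This does one pass per distinct section, so it is fine for a handful of groups but can get slow when there are many.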

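Hope this helps! Peace

User

#2 · Edited 2024-10-02 04:23:29

You can use a combination of groupby, transform, and fillna. Note that I've modified your example, because otherwise both Sections would have the same mean. Starting from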

In [21]: df
Out[21]: 
    Name Sex  Section  Price
0    Joe   M        1    2.0
1    Bob   M        1    NaN
2  Nancy   F        2    5.0
3  Grace   F        1    6.0
4    Jen   F        2   10.0
5   Paul   M        2    NaN

we can use

In [22]: df["Price"] = df["Price"].fillna(df.groupby("Section")["Price"].transform("mean"))

which produces

In [23]: df
Out[23]: 
    Name Sex  Section  Price
0    Joe   M        1    2.0
1    Bob   M        1    4.0
2  Nancy   F        2    5.0
3  Grace   F        1    6.0
4    Jen   F        2   10.0
5   Paul   M        2    7.5

This works because we can compute the mean by Section:

In [29]: df.groupby("Section")["Price"].mean()
Out[29]: 
Section
1    4.0
2    7.5
Name: Price, dtype: float64

and to broadcast this back to a full Series that we can pass to fillna(), we can use transform:

In [30]: df.groupby("Section")["Price"].transform("mean")
Out[30]: 
0    4.0
1    4.0
2    7.5
3    4.0
4    7.5
5    7.5
Name: Price, dtype: float64
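The pieces shown in this answer can be put together into a runnable sketch. The DataFrame construction is reconstructed from the printed table, and the variable name `section_means` is introduced here only for illustration:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table printed in this answer
df = pd.DataFrame({
    "Name": ["Joe", "Bob", "Nancy", "Grace", "Jen", "Paul"],
    "Sex": ["M", "M", "F", "F", "F", "M"],
    "Section": [1, 1, 2, 1, 2, 2],
    "Price": [2.0, np.nan, 5.0, 6.0, 10.0, np.nan],
})

# One value per row: the mean Price of that row's Section,
# aligned with df's index (NaN prices are skipped when averaging)
section_means = df.groupby("Section")["Price"].transform("mean")

# fillna with a Series fills each NaN from the value at the same index label
df["Price"] = df["Price"].fillna(section_means)

print(df)
```

Because `transform` returns a Series with the same index as `df`, no explicit loop or merge is needed; rows that already have a Price are left untouched.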

User

#3 · Edited 2024-10-02 04:23:29

`pandas`: surgical, but slower

See @DSM's answer for a faster `pandas` solution

This is a more surgical approach that may offer some perspective and could be useful

Use groupby
- compute the mean for each Section
```
means = df.groupby('Section').Price.mean()
```
Identify the nulls
- use isnull to build a mask for boolean slicing
```
nulls = df.Price.isnull()
```
Use map
- slice the Section column down to the rows where Price is null, then map each Section to its mean
```
fills = df.Section[nulls].map(means)
```
Use loc
- fill in df only where the nulls are
```
df.loc[nulls, 'Price'] = fills
```

All together

means = df.groupby('Section').Price.mean()
nulls = df.Price.isnull()
fills = df.Section[nulls].map(means)
df.loc[nulls, 'Price'] = fills

print(df)

    Name Sex  Section  Price
0    Joe   M        1    2.0
1    Bob   M        1    4.0
2  Nancy   F        2    5.0
3  Grace   F        1    6.0
4    Jen   F        2   10.0
5   Paul   M        2    7.5
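As a quick sanity check, the following sketch compares this map-based fill with the groupby/transform fill from @DSM's answer on the same sample data. The helper `make_df` and the names `a` and `b` are hypothetical and exist only for this comparison:

```python
import numpy as np
import pandas as pd

def make_df():
    # Sample data reconstructed from the printed table (Name/Sex omitted)
    return pd.DataFrame({
        "Section": [1, 1, 2, 1, 2, 2],
        "Price": [2.0, np.nan, 5.0, 6.0, 10.0, np.nan],
    })

# Map-based fill: look up each null row's Section in the per-Section means
a = make_df()
means = a.groupby("Section")["Price"].mean()
nulls = a["Price"].isnull()
a.loc[nulls, "Price"] = a.loc[nulls, "Section"].map(means)

# groupby/transform fill: broadcast the means to every row, then fillna
b = make_df()
b["Price"] = b["Price"].fillna(b.groupby("Section")["Price"].transform("mean"))

# Raises AssertionError if the two results differ
pd.testing.assert_series_equal(a["Price"], b["Price"])
print(a)
```

Both routes give the same filled column here; the map-based version only touches the null rows, which is what makes it "surgical".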
