Find the percentage of missing values in each column of the given data

import pandas as pd
df = pd.read_csv('https://query.data.world/s/Hfu_PsEuD1Z_yJHmGaxWTxvkz7W_b0')
percent = 100 * (len(df.loc[:, df.isnull().sum(axis=0) >= 1].index) / len(df.index))
print(round(percent, 2))

Ord_id                 0.00
Prod_id                0.00
Ship_id                0.00
Cust_id                0.00
Sales                  0.24
Discount               0.65
Order_Quantity         0.65
Profit                 0.65
Shipping_Cost          0.65
Product_Base_Margin    1.30
dtype: float64
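For reference, the attempt above seems to produce one number rather than one value per column. Selecting columns with df.loc[:, mask] keeps every row, so the length of the row index never changes and the ratio is always 1. A minimal sketch on a made-up frame (the values are hypothetical, not taken from the CSV in the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV in the question.
df = pd.DataFrame({
    "Sales": [10.0, np.nan, 30.0, 40.0],
    "Profit": [1.0, 2.0, 3.0, 4.0],
})

# Boolean mask over columns: True where a column has at least one missing value.
cols_with_missing = df.isnull().sum(axis=0) >= 1

# Selecting columns leaves all rows in place, so the row index is unchanged.
subset = df.loc[:, cols_with_missing]
percent = 100 * (len(subset.index) / len(df.index))
print(round(percent, 2))
```

On this frame only Sales is selected, yet the printed value is 100.0, because the expression compares the row count with itself.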

3 Answers

User

Answer 1 · edited 2024-10-05 13:46:37

To cover all missing values and round the result (isnull() is an alias of isna(), so a single call is enough):

(df.isna().sum() * 100 / df.index.size).round(2)

Output:

Ord_id                 0.00
Prod_id                0.00
Ship_id                0.00
Cust_id                0.00
Sales                  0.24
Discount               0.65
Order_Quantity         0.65
Profit                 0.65
Shipping_Cost          0.65
Product_Base_Margin    1.30
dtype: float64
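As a rough check of what counts as missing, the sketch below builds a small hypothetical frame (the values are made up for illustration) containing NaN, None, and NaT, then compares the two masks. It assumes nothing beyond pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical data with three kinds of missing markers: NaN, None, NaT.
df = pd.DataFrame({
    "Sales": [10.0, np.nan, 30.0, 40.0],
    "Ship_id": ["a", None, None, "d"],
    "When": pd.to_datetime(["2024-01-01", None, "2024-01-03", "2024-01-04"]),
})

# isnull() and isna() should produce identical boolean masks.
same_mask = df.isnull().equals(df.isna())

# Percentage of missing values per column, rounded to two decimals.
pct = (df.isna().sum() * 100 / len(df)).round(2)

print(same_mask)
print(pct)
```

Note that empty strings are not treated as missing by either method.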

User

Answer 2 · edited 2024-10-05 13:46:37

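Update: let's use mean together with isnull. The mean of a boolean mask is the fraction of True values, so multiplying by 100 gives the percentage: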

df.isnull().mean() * 100

Output:

Ord_id                 0.000000
Prod_id                0.000000
Ship_id                0.000000
Cust_id                0.000000
Sales                  0.238124
Discount               0.654840
Order_Quantity         0.654840
Profit                 0.654840
Shipping_Cost          0.654840
Product_Base_Margin    1.297774
dtype: float64
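To see why the mean works, here is a small sketch on a hypothetical four-row frame (column names borrowed from the question, values made up). True counts as 1 and False as 0, so the column mean of the mask is the missing fraction, and it should agree with dividing the sum by the row count.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame; values are for illustration only.
df = pd.DataFrame({
    "Sales": [10.0, np.nan, 30.0, 40.0],
    "Discount": [np.nan, np.nan, 0.1, 0.2],
    "Ord_id": ["o1", "o2", "o3", "o4"],
})

mask = df.isnull()

# Mean of booleans = fraction of True values in each column.
via_mean = mask.mean() * 100

# Equivalent form: count of missing values divided by the number of rows.
via_sum = mask.sum() / df.shape[0] * 100

print(via_mean)
```

Both forms give 25.0 for Sales, 50.0 for Discount, and 0.0 for Ord_id on this frame.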

IIUC:

df.isnull().sum() / df.shape[0] * 100.00

Output:

Ord_id                 0.000000
Prod_id                0.000000
Ship_id                0.000000
Cust_id                0.000000
Sales                  0.238124
Discount               0.654840
Order_Quantity         0.654840
Profit                 0.654840
Shipping_Cost          0.654840
Product_Base_Margin    1.297774
dtype: float64

User

Answer 3 · edited 2024-10-05 13:46:37

How about this? I think I found something similar on here before, but I can't find it now...

percent_missing = df.isnull().sum() * 100 / len(df)
missing_value_df = pd.DataFrame({'column_name': df.columns,
                                 'percent_missing': percent_missing})

If you want the missing percentages sorted, follow the above with:

missing_value_df.sort_values('percent_missing', inplace=True)

As mentioned in the comments, you can also get by with just the first line of the code above, i.e.:

percent_missing = df.isnull().sum() * 100 / len(df)
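Since percent_missing is a Series indexed by column name, it can also be filtered and sorted directly. The sketch below is one possible way to do that on a made-up frame (values are hypothetical; the names only_missing and table are introduced here for illustration). It avoids inplace=True by assigning the result instead.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; values are for illustration only.
df = pd.DataFrame({
    "Ord_id": ["o1", "o2", "o3", "o4"],
    "Sales": [10.0, np.nan, 30.0, 40.0],
    "Product_Base_Margin": [np.nan, np.nan, 0.5, 0.6],
})

percent_missing = df.isnull().sum() * 100 / len(df)

# Keep only columns that have missing values, largest percentage first.
only_missing = percent_missing[percent_missing > 0].sort_values(ascending=False)

# Optional: turn the Series into a two-column table.
table = only_missing.rename_axis("column_name").reset_index(name="percent_missing")

print(table)
```

Here Product_Base_Margin (50.0) comes first, followed by Sales (25.0), and Ord_id is dropped because it has no missing values.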
