I am trying to find out which shops had an "empty" day, i.e. a day on which no customers came.
My table has the following structure:
+---------+------------+------------+------------+------------+------------+------------+------------+
| shop    | 2020-10-15 | 2020-10-16 | 2020-10-17 | 2020-10-18 | 2020-10-19 | 2020-10-20 | 2020-10-21 |
+---------+------------+------------+------------+------------+------------+------------+------------+
| Paris   | 215        | 213        | 128        | 102        | 195        | 180        | 110        |
| London  | 145        | 106        | 102        | 83         | 127        | 111        | 56         |
| Beijing | 179        | 245        | 134        | 136        | 207        | 183        | 136        |
| Sydney  | 0          | 0          | 0          | 0          | 0          | 6          | 36         |
+---------+------------+------------+------------+------------+------------+------------+------------+
Using pandas, I can do something like customers[customers == 0].dropna(how="all"), which keeps only the rows that contain a 0, and I get the following result:
+---------+------------+------------+------------+------------+------------+------------+------------+
| shop    | 2020-10-15 | 2020-10-16 | 2020-10-17 | 2020-10-18 | 2020-10-19 | 2020-10-20 | 2020-10-21 |
+---------+------------+------------+------------+------------+------------+------------+------------+
| Sydney  | 0          | 0          | 0          | 0          | 0          | NaN        | NaN        |
+---------+------------+------------+------------+------------+------------+------------+------------+
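For reference, the pandas step described above can be reproduced with a small script. This is only a sketch: it assumes that shop is the index of the DataFrame (which is what the output above suggests) and that the date columns are plain strings; neither detail is stated in the question.

```python
import pandas as pd

# Sample data copied from the table in the question.
columns = ["shop", "2020-10-15", "2020-10-16", "2020-10-17", "2020-10-18",
           "2020-10-19", "2020-10-20", "2020-10-21"]
rows = [
    ("Paris", 215, 213, 128, 102, 195, 180, 110),
    ("London", 145, 106, 102, 83, 127, 111, 56),
    ("Beijing", 179, 245, 134, 136, 207, 183, 136),
    ("Sydney", 0, 0, 0, 0, 0, 6, 36),
]

# Assumption: shop is used as the index, so only the counts are compared with 0.
customers = pd.DataFrame(rows, columns=columns).set_index("shop")

# The boolean mask replaces every non-zero cell with NaN;
# dropna(how="all") then drops the rows that contain no zero at all.
empty_days = customers[customers == 0].dropna(how="all")
print(empty_days)
```

Because masking introduces NaN, the remaining zeros are shown as floats (0.0) rather than integers.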
In PySpark, I believe [...]
Creating the sample dataset:
You can apply the filter function:
Result: