Filtering a DataFrame using a list of strings

pop_df
Out[33]:
            2014       2015       2016       2017       2018       2019
Geo
AL     2892394.0  2885796.0  2875592.0  2876591.0  2870324.0  2862427.0
AL0    2892394.0  2885796.0  2875592.0  2876591.0  2870324.0  2862427.0
AL01    844921.0   836448.0   830981.0   826904.0   819793.0   813758.0
AL011   134332.0   131054.0   129056.0   125579.0   120978.0   118948.0
AL012   276058.0   277989.0   280205.0   284823.0   289626.0   290126.0
...          ...        ...        ...        ...        ...        ...
UKN12   142028.0   142756.0   143363.0   143746.0   144105.0   144367.0
UKN13   139774.0   140222.0   140752.0   141368.0   141994.0   142565.0
UKN14   137722.0   139426.0   140691.0   141917.0   143286.0   144771.0
UKN15   136332.0   136904.0   137492.0   138000.0   138441.0   138948.0
UKN16   114696.0   115171.0   115581.0   116057.0   116612.0   117051.0

[2034 rows x 6 columns]

2 answers

User

Answer 1 · edited 2024-09-28 03:19:20

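I suggest slicing each index label (a string) down to its first two characters and passing the result to the pandas isin method, which gives a boolean mask over the country codes: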

eu_countries_filtered = pop_df[pop_df.index.str[:2].isin(EuropeanUnion)]
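To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps on a small dummy frame. The contents of EuropeanUnion and the sample rows are assumptions made for illustration; the original list is not shown in the question.

```python
import pandas as pd

# Hypothetical subset of EU country codes (the real list is not shown).
EuropeanUnion = ["BE", "DE", "FR"]

# Dummy frame indexed by region codes, mimicking the shape of pop_df.
pop_df = pd.DataFrame(
    {
        "2018": [2870324.0, 120978.0, 141994.0, 144105.0],
        "2019": [2862427.0, 118948.0, 142565.0, 144367.0],
    },
    index=pd.Index(["AL", "BE011", "DE13", "UKN12"], name="Geo"),
)

# Step 1: the first two characters of each index label are the country code.
country_codes = pop_df.index.str[:2]

# Step 2: boolean mask, True where the country code appears in the list.
mask = country_codes.isin(EuropeanUnion)

# Step 3: keep only the rows where the mask is True.
eu_countries_filtered = pop_df[mask]
print(eu_countries_filtered.index.tolist())  # ['BE011', 'DE13']
```

Because the comparison is an exact match on the two-letter prefix, this only works when every code in the list is exactly two characters long.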

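User

Answer 2 · edited 2024-09-28 03:19:20

Geo appears to be the index, so you can do the following (df here stands for your DataFrame, pop_df in the question):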

result = df[df.index.str.match(rf'\b{"|".join(EuropeanUnion)}')]

Output (dummy data)

           2014      2015      2016      2017      2018      2019
Geo                                                              
BE011  134332.0  131054.0  129056.0  125579.0  120978.0  118948.0
DE13   139774.0  140222.0  140752.0  141368.0  141994.0  142565.0

From the documentation for str.match:

Determine if each string starts with a match of a regular expression.

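The expression rf'\b{"|".join(EuropeanUnion)}' builds a regular-expression pattern that alternates over the country codes, so it matches any index label that starts with one of them.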
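To see what that pattern looks like and why it only selects labels that begin with a country code, here is a small sketch using the standard re module. The EuropeanUnion list and the sample labels are assumptions for illustration. The sketch leaves out \b, on the assumption that it adds nothing when matching is anchored at the start of the string, and adds re.escape as a precaution in case a code ever contains a regex metacharacter.

```python
import re

# Hypothetical subset of EU country codes (the real list is not shown).
EuropeanUnion = ["BE", "DE", "FR"]

# Join the codes with "|" so the pattern means "BE or DE or FR".
pattern = "|".join(re.escape(code) for code in EuropeanUnion)
print(pattern)  # BE|DE|FR

# re.match only succeeds if the pattern matches at the start of the string,
# which is the behaviour the str.match documentation describes.
labels = ["AL01", "BE011", "DE13", "UKN12", "XBE1"]
matches = [label for label in labels if re.match(pattern, label)]
print(matches)  # ['BE011', 'DE13']
```

Note that "XBE1" is rejected even though it contains "BE", because the match has to begin at the first character.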
