Get only the number at the end (regular expression)

User
Reply #1 · edited 2024-06-18 13:10:29

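In the pattern that you tried, the part (?<=\s)(\d*\s*) matches optional digits followed by optional whitespace characters, with a whitespace character required directly to the left.
Because both the digits and the whitespace in the match are optional, it also matches every position in the string that has a whitespace character to its left.
In the part (\d*\.\d*)$ the digits are optional as well, so it can also match just a single dot at the end of the string.
If the number at the end must be preceded by a whitespace character, you can use: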
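The two problems described above can be reproduced with Python's built-in re module. This is only a small sketch; the sample strings are made up for illustration and are not from the question.

```python
import re

# First part of the attempted pattern: digits and whitespace are both optional,
# so an empty match is allowed at any position preceded by whitespace.
optional_part = re.compile(r"(?<=\s)(\d*\s*)")
sample = "UNT N2 600"  # illustrative sample string
matches = [m.group(0) for m in optional_part.finditer(sample)]
print(matches)  # contains an empty match right after the first space

# Second part: digits around the dot are optional, so a lone trailing dot matches.
lone_dot = re.search(r"(\d*\.\d*)$", "end of sentence.").group(1)
print(repr(lone_dot))
```

The empty string in the first result is the match at the position just before "N2", which is why the original pattern returned unwanted hits.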
(?<=\s)\d{1,3}(?:\.\d{3})*$
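The pattern matches:
(?<=\s) Positive lookbehind, assert a whitespace character directly to the left of the current position
\d{1,3} Match 1-3 digits
(?:\.\d{3})* Optionally repeat a dot followed by 3 digits
$ End of the string
See a regex demo.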
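As a quick check of the pattern outside pandas, here is a minimal sketch using the re module. The helper name find_end_number and the last sample string are assumptions added for illustration.

```python
import re

# Pattern from the answer: 1-3 digits, optional ".ddd" groups, anchored at the
# end of the string and required to follow a whitespace character.
END_NUMBER = re.compile(r"(?<=\s)\d{1,3}(?:\.\d{3})*$")

def find_end_number(text):
    """Return the trailing number as a string, or None when nothing matches."""
    match = END_NUMBER.search(text)
    return match.group(0) if match else None

samples = [
    "VISTA AES TIETE E UNT N2 600",
    "VISTA IT AUUNIBANCO PN N1 1.400",
    "OPCAO DE VENDA 04/21 COGNP450ON 4,50COGNE 100.000",
    "UNT N2 12.34",  # hypothetical input: the dot is not followed by 3 digits
]
for s in samples:
    print(repr(s), "->", find_end_number(s))
```

The last sample returns None, because "12.34" does not fit the 1-3 digits plus groups of three that the pattern requires.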
If the number can also be the whole string by itself, you can assert a whitespace boundary to the left with (?<!\S) instead:
(?<!\S)\d{1,3}(?:\.\d{3})*$
See another regex demo.
For example, using str.extract and wrapping the pattern in a capture group:
import pandas as pd

strings = [
    "VISTA AES TIETE E UNT N2 600",
    "VISTA IT AUUNIBANCO PN N1 1.400",
    "OPCAO DE VENDA 04/21 COGNP450ON 4,50COGNE 100.000"
]

df = pd.DataFrame(strings, columns=["colName"])
# str.extract needs a capture group; the group holds the number at the end
df['lastNumbers'] = df['colName'].str.extract(r"(?<=\s)(\d{1,3}(?:\.\d{3})*)$")
print(df)
Output
                                             colName lastNumbers
0                       VISTA AES TIETE E UNT N2 600         600
1                    VISTA IT AUUNIBANCO PN N1 1.400       1.400
2  OPCAO DE VENDA 04/21 COGNP450ON 4,50COGNE 100.000     100.000
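The difference between the two lookbehinds only shows when the number starts the string. A small sketch with the re module; the helper name matched and the inputs "1.400" and "N11.400" are assumptions for illustration.

```python
import re

after_whitespace = re.compile(r"(?<=\s)\d{1,3}(?:\.\d{3})*$")     # needs a whitespace char
whitespace_boundary = re.compile(r"(?<!\S)\d{1,3}(?:\.\d{3})*$")  # whitespace or string start

def matched(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None

for text in ["N1 1.400", "1.400", "N11.400"]:
    print(repr(text), matched(after_whitespace, text), matched(whitespace_boundary, text))
```

Only the (?<!\S) variant accepts "1.400" on its own, and neither variant accepts a number glued to preceding letters.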

User
Reply #2 · edited 2024-06-18 13:10:29

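Actually, for your use case I don't think you even need a regex.
You can simply split the string, take the last item, and replace the dots with an empty string.
If it is a DataFrame (since you tagged pandas):
> # regex=False makes '.' a literal dot regardless of the pandas version
> df['colName'].str.split().str[-1].str.replace('.', '', regex=False)
0       600
1      1400
2    100000
Name: colName, dtype: object
If it is a list of strings:
> list(map(lambda x: x.replace('.', ''), map(lambda x: x.split()[-1], data)))
['600', '1400', '100000']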
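One thing to keep in mind with the split approach is that it returns the last token whether or not it is a number. Below is a hedged sketch of a plain-Python variant that validates the token first. The helper name last_number is hypothetical, and it assumes the dots are thousands separators in groups of three, as in the sample data.

```python
import re
from typing import Optional

# Assumption: a valid number is 1-3 digits followed by optional ".ddd" groups.
_GROUPED = re.compile(r"\d{1,3}(?:\.\d{3})*")

def last_number(text: str) -> Optional[int]:
    """Return the last whitespace-separated token as an int, or None."""
    tokens = text.split()
    if not tokens:
        return None  # empty or whitespace-only input
    token = tokens[-1]
    if _GROUPED.fullmatch(token) is None:
        return None  # last token is not a dot-grouped number
    return int(token.replace(".", ""))

data = [
    "VISTA AES TIETE E UNT N2 600",
    "VISTA IT AUUNIBANCO PN N1 1.400",
    "OPCAO DE VENDA 04/21 COGNP450ON 4,50COGNE 100.000",
]
print([last_number(s) for s in data])
```

Under this assumption an ungrouped token such as "1400" is rejected too; loosen the pattern if such values can occur in your data.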

User
Reply #3 · edited 2024-06-18 13:10:29

l = ["VISTA AES TIETE E UNT N2 600",
"VISTA IT AUUNIBANCO PN N1 1.400",
"OPCAO DE VENDA 04/21 COGNP450ON 4,50COGNE 100.000"]

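If the data is in the form of a DataFrame

from pandas import DataFrame

df=DataFrame({
    'col':l
})
# raw strings keep the backslashes intact for the regex engine
df.col.str.extract(r'(\d*\.*\d*)?$').astype(str).replace(r'\.','', regex=True)

Output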

0   600
1   1400
2   100000
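A caveat with this looser pattern: it allows only one run of dots, so a number with two separators is captured only partially. The sketch below shows this with the re module (the pandas docs describe str.extract as taking the first match, so the same result is expected there). The input "TOTAL 1.234.567" is a made-up example, and the stricter pattern is the one from reply #1.

```python
import re

loose = re.compile(r"(\d*\.*\d*)?$")                 # pattern from this reply
strict = re.compile(r"(?<!\S)\d{1,3}(?:\.\d{3})*$")  # pattern from reply #1

text = "TOTAL 1.234.567"  # hypothetical input with two thousands separators

loose_result = loose.search(text).group(1)
strict_result = strict.search(text).group(0)

print(loose_result)   # only the tail of the number
print(strict_result)  # the whole number
```

For the three sample strings in the question both patterns give the same numbers, so this only matters if larger values can appear.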
