Finding numeric characters in a single-word string in Python

Published 2024-10-01 02:25:36


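The text field of a CSV contains a variety of values.

Some of the values look like this: AGM00BALDWIN AGM00BOUCK

However, some are duplicates, and their names have been changed to AGM00BOUCK01 AGM00COBDEN01 AGM00COBDEN02

My goal is to write a specific ID for the values that do not have a numeric suffix.

Here is the code so far: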

prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)

if "*1" not in name and "*2" not in name:
    prov_ID = prov_count + 1
else:
    prov_ID = ""

It looks like wildcards are not the right approach here, but I can't seem to find a suitable solution.


3 answers

There are several ways to do this. One is to use the isdigit function:

a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

for i in a:
  if i[-1].isdigit():  # can use i[-1] and i[-2] for both numbers
    print (i)


Using regex:

import re
a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

pat = re.compile(r"^.*\d$")  # can use "\d\d" instead of "\d" for 2 numbers
for i in a:
  if pat.match(i): print (i)

Another:

for i in a:
    # compare the last character of each item against the digits '0'..'9'
    if i[-1:] in map(str, range(10)): print (i)

All of the methods above return the inputs that have a numeric suffix:

AGM00BOUCK01
AGM00COBDEN01
AGM00COBDEN02
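
The isdigit check above looks at one character at a time. As a minimal sketch of a variation that handles a suffix of any length, the trailing digits can be stripped with str.rstrip and compared against the original. The helper name split_suffix is hypothetical and not part of the original answer:

```python
import string

def split_suffix(name):
    """Split a name into (base, trailing-digit suffix); suffix is '' if none."""
    base = name.rstrip(string.digits)  # drop every digit at the end of the name
    return base, name[len(base):]

names = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

# Keep only the names that end in at least one digit
with_suffix = [n for n in names if split_suffix(n)[1]]
print(with_suffix)
```

Digits in the middle of a name (the "00" in AGM00BOUCK) are left alone, because rstrip only removes characters from the right-hand end.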

You can use slicing to get the last 2 characters of each element, then check whether it ends with '01' or '02':

l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

for i in l:
    if i[-2:] in ('01', '02'):
        print('{} is a duplicate'.format(i))

Output:

AGM00BOUCK01 is a duplicate
AGM00COBDEN01 is a duplicate
AGM00COBDEN02 is a duplicate

Or another way is to use the str.endswith method:

l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

for i in l:
    if i.endswith('01') or i.endswith('02'):
        print('{} is a duplicate'.format(i))

So your code would look like this:

prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)

# name[-2:] is the last two characters; only names without the suffix get an ID
if name[-2:] not in ('01', '02'):
    prov_ID = prov_count + 1
else:
    prov_ID = ""
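
As a side note on the endswith approach, str.endswith also accepts a tuple of suffixes, so several suffixes can be tested in a single call. The sketch below is illustrative only; the names SUFFIXES, duplicates and originals are made up for the example. A fixed suffix list will also miss a later '03', so it only suits data where the suffixes are known in advance:

```python
names = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

# Assumed list of known duplicate suffixes; extend it if the data has more
SUFFIXES = ("01", "02")

# endswith returns True if the string ends with any suffix in the tuple
duplicates = [n for n in names if n.endswith(SUFFIXES)]
originals = [n for n in names if not n.endswith(SUFFIXES)]

print(duplicates)
print(originals)
```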

Using a regular expression seems like a good fit here:

import re

pattern = re.compile(r'(\d+$)')

prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)

# search() scans the whole string and returns None when no trailing digits exist
if pattern.search(name) is None:
    prov_ID = prov_count + 1
else:
    prov_ID = ""
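
To tie this back to the CSV in the question, here is a rough end-to-end sketch. It assumes the file has name, x and y columns and that each name without a numeric suffix should get the next ID after 3000. Both are guesses, since the question shows neither the file layout nor how prov_count advances. The sample data and the assign_ids function are hypothetical:

```python
import csv
import io
import re

SUFFIX = re.compile(r"\d+$")  # one or more digits at the end of the string

# Hypothetical sample data standing in for the real CSV file
SAMPLE = """name,x,y
AGM00BALDWIN,1.0,2.0
AGM00BOUCK,3.0,4.0
AGM00BOUCK01,3.1,4.1
AGM00COBDEN01,5.0,6.0
AGM00COBDEN02,5.1,6.1
"""

def assign_ids(rows, start=3000):
    """Return (name, prov_ID) pairs; suffixed names get an empty ID."""
    prov_count = start
    result = []
    for row in rows:
        name = row["name"]
        if SUFFIX.search(name) is None:
            prov_count += 1  # assumption: IDs increase by one per assigned name
            prov_id = prov_count
        else:
            prov_id = ""
        result.append((name, prov_id))
    return result

ids = assign_ids(csv.DictReader(io.StringIO(SAMPLE)))
for name, prov_id in ids:
    print(name, prov_id)
```

For a real file, the io.StringIO(SAMPLE) argument would be replaced with an opened file object.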
