Getting a substring inside list elements in Python (web scraping)

User

Reply 1 · edited 2024-09-28 20:47:34

One option (based on the sample data you provided) could be:

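import re

strings = ['Fresh Value Colocasia 250g', 'Fresh Value Banana Robusta 1kg', 'Fresh Value Raw Papaya 1 U (units) (300g-400g)', 'Fresh Value Premium Pomegranate Kabul (500g - 700g)']
for s in strings:
    # The first digit or opening parenthesis marks where the quantity begins.
    start = re.findall(r'\d|\(', s)[0]
    # maxsplit=1 keeps a repeat of that character inside the quantity intact.
    name, rest = s.split(start, 1)
    print('Name = ' + name.strip() + ', Quantity = ' + start + rest)

Output:

Name = Fresh Value Colocasia, Quantity = 250g
Name = Fresh Value Banana Robusta, Quantity = 1kg
Name = Fresh Value Raw Papaya, Quantity = 1 U (units) (300g-400g)
Name = Fresh Value Premium Pomegranate Kabul, Quantity = (500g - 700g)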

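Of course, this only works if digits and parentheses appear in the quantity and never in the name. If a quantity can start with some other symbol, add that symbol to the findall pattern.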
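As a rough sketch of that last point, the set of characters that may start a quantity can be made a parameter. The function name `split_name_quantity`, the `extra_starts` parameter, and the "~500g" sample string are assumptions made for illustration; they are not part of the original answer.

```python
import re

def split_name_quantity(text, extra_starts=""):
    """Split text at the first digit, '(' or any character in extra_starts."""
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    pattern = r"[\d(" + re.escape(extra_starts) + "]"
    match = re.search(pattern, text)
    if match is None:
        # No quantity marker found: treat the whole string as the name.
        return text.strip(), ""
    return text[:match.start()].strip(), text[match.start():]

# Sample string from the thread.
print(split_name_quantity("Fresh Value Premium Pomegranate Kabul (500g - 700g)"))
# Hypothetical string whose quantity starts with '~'.
print(split_name_quantity("Fresh Value Coconut ~500g", extra_starts="~"))
```

Slicing at the match position, rather than splitting on the matched character, also avoids problems when that character appears again later in the quantity.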

User

Reply 2 · edited 2024-09-28 20:47:34

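import re

def substring(string):
    """Split a product string into its name and quantity parts."""
    words = string.split()
    # Default: the whole string is the name if no later word contains a digit.
    output = {"Name": " ".join(words), "Quantity": ""}
    # The first word is always treated as part of the name.
    for i in range(1, len(words)):
        if re.search(r'\d', words[i]):
            # The first word containing a digit starts the quantity.
            output["Name"] = " ".join(words[:i])
            output["Quantity"] = " ".join(words[i:])
            break
    return output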

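Then pass a string to the function, like this:

print(substring('Fresh Value Raw Papaya 1 U (units) (300g-400g)'))

You will get:

{'Name': 'Fresh Value Raw Papaya', 'Quantity': '1 U (units) (300g-400g)'}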
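Since the question is about list elements, the same word-based idea can be applied to a whole list with a list comprehension. This is only a sketch: `split_words`, `scraped`, and `results` are names assumed here for illustration, and the splitter is a compact restatement of the approach rather than the answer's exact code.

```python
import re

def split_words(text):
    """Word-based split: the first later word containing a digit starts the quantity."""
    words = text.split()
    for i, word in enumerate(words):
        # Skip index 0 so the first word always belongs to the name.
        if i > 0 and re.search(r"\d", word):
            return {"Name": " ".join(words[:i]), "Quantity": " ".join(words[i:])}
    # No digit found: everything is the name.
    return {"Name": " ".join(words), "Quantity": ""}

scraped = [
    "Fresh Value Colocasia 250g",
    "Fresh Value Banana Robusta 1kg",
    "Fresh Value Raw Papaya 1 U (units) (300g-400g)",
    "Fresh Value Premium Pomegranate Kabul (500g - 700g)",
]
results = [split_words(s) for s in scraped]
for row in results:
    print(row)
```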

User

Reply 3 · edited 2024-09-28 20:47:34

You can also try this:

import re

def split_unit(stri):
    # Find the first run of digits in the string.
    to_split = re.findall(r"\d+", stri)[0]
    # Split once at that number, then put the number back in front of the rest.
    splitted = to_split + stri.split(to_split, 1)[1]
    print(splitted)

split_unit("Fresh Value Colocasia 250g")  # outputs: 250g
split_unit("Fresh Value Banana Robusta 1kg")  # outputs: 1kg
split_unit("Fresh Value Raw Papaya 1 U (units) (300g-400g)")  # outputs: 1 U (units) (300g-400g)

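And so on. What I did: the first line inside the function uses a regex to find the first integer that occurs in the string. The split() method then splits the string once at that integer, and everything after it is joined back onto the integer stored in to_split.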
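A variant of the same split-and-rejoin idea that returns both parts instead of printing, and does not raise when the string contains no integer, might look like the sketch below. The name `split_at_first_integer` and the "Fresh Value Mint" sample are assumptions for illustration. Note that, like the answer above, it keys on digits only, so a leading parenthesis such as in "(500g - 700g)" would stay with the name.

```python
import re

def split_at_first_integer(text):
    """Return (name, quantity), splitting at the first run of digits."""
    match = re.search(r"\d+", text)
    if match is None:
        # No integer anywhere: there is no quantity to extract.
        return text.strip(), None
    # partition splits at the first occurrence of the matched number,
    # which is the same place split(number, 1) would split.
    name, number, rest = text.partition(match.group())
    return name.strip(), number + rest

print(split_at_first_integer("Fresh Value Banana Robusta 1kg"))
print(split_at_first_integer("Fresh Value Mint"))  # hypothetical input without a quantity
```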
