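Using regex to search product names and product descriptions from a CSV for a volts filter

What I want to do is remove duplicate values from my search results. I have tried sets, lists, and so on, and I am struggling to understand why I can't remove the duplicate words from my search. I don't fully understand how sets work: the set seems to split every value into single characters (1, 2, v, o, l, t). Is there no way to remove whole duplicate words from what was found? When I run the code, I get: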
12 Volt
12 Volt
40 Volt
2 Volt
18 Volt
18 Volt
240 Volt
240 Volt
110 Volt
110 Volt
110 Volt
36 Volt
What I need, and am struggling to achieve, is a list of unique values, i.e. 12 Volt, 40 Volt, 18 Volt, 240 Volt, etc.
def volts_search():
    with open('filters/volts_filter.csv', 'w') as headerOut:
        headerOut.write("name" + "," + "sort_order" + "," + "status" + "," + "image" + "," + "regex" + "," + "value" + "\n")
    with open(merchant_feed, 'r') as csv_filein, open('filters/volts_filter.csv', 'a') as fileOut:
        reader = csv.DictReader(csv_filein, delimiter=',', quotechar='"')
        for row in reader:
            program_name = clean_text(row['program_name'])
            product_name = clean_text(row['product_name'])
            product_description = clean_text(row['description'])
            merchant_category = clean_text(row['merchant_category'])
            product_id = row['product_id']
            product_brand = clean_text(row['brand'])
            filter_name = "Filter By Volts:"
            v = re.findall(r"((?i)(?:)\d+\.\d+v|\d+\.\d+ v|\d+ v|\d+v)", product_name + product_description)
            volt = re.findall(r"((?i)(?:)\d+volt|\d+ volt)", product_name + product_description)
            volts = re.findall(r"((?i)(?:)\d+\.\d+volts|\d+volts)", product_name + product_description)
            seen = set()
            for filter_search in volt:
                if filter_search in product_name + product_description:
                    if filter_search in seen: continue
                    seen.add(filter_search)
                    print(filter_search)
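One likely cause, assuming `seen = set()` sits inside the `for row in reader` loop as the repeated output suggests: the set is recreated for every row, so it only removes duplicates within a single product and never across products. The single-character symptom usually comes from building the set out of a string (`set("12 Volt")` or `seen.update("12 Volt")`) rather than adding the string as one element. The sketch below is a minimal illustration of both points. `unique_volts`, `VOLT_RE`, and the sample `rows` are hypothetical names and data, not part of the original code, and the simplified pattern only covers the "volt"/"volts" spellings.

```python
import re

# set("12 Volt") iterates the string and stores single characters;
# {"12 Volt"} or seen.add("12 Volt") stores the whole string as one element.

# Simplified, case-insensitive pattern (an assumption, not the original regex):
# matches "12 volt", "12volt", "12 Volts", "1.5 volt", ...
VOLT_RE = re.compile(r"\d+(?:\.\d+)?\s?volts?", re.IGNORECASE)


def unique_volts(texts):
    """Return each distinct voltage once, in first-seen order."""
    seen = set()  # created once, before looping over the rows
    unique = []
    for text in texts:
        for match in VOLT_RE.findall(text):
            # Normalise so "12 volt", "12 Volt" and "12volts" compare equal.
            number = re.match(r"\d+(?:\.\d+)?", match).group()
            key = f"{number} Volt"
            if key not in seen:
                seen.add(key)
                unique.append(key)
    return unique


# Hypothetical sample rows standing in for product_name + product_description.
rows = [
    "12 Volt drill",
    "12 Volt battery",
    "40 volt mower",
    "18volt saw, 18 Volt charger",
]
print(unique_volts(rows))  # ['12 Volt', '40 Volt', '18 Volt']
```

In `volts_search()` the same idea would amount to creating `seen` once before `for row in reader:` and adding a normalised string to it, so that a value already printed for an earlier row is skipped.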
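Regular expression
This expression might help you remove the duplicate entries from your CSV file using a string replacement:
Graph
This graph shows how it works, using backreferences: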
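As a rough illustration of the backreference idea (this is not the expression the answer refers to, which is not shown here), the sketch below collapses runs of identical consecutive lines with `re.sub`. The pattern, the `collapse_adjacent_duplicates` name, and the sample text are assumptions made for the example. It only removes duplicates that sit on adjacent lines, so input with scattered duplicates would need sorting first or a set-based approach instead.

```python
import re

# ^(.*)           capture one whole line
# (?:\r?\n\1)+$   followed by one or more lines that repeat it exactly
# The trailing $ stops "12 Volt" from matching the start of "12 Volts".
DUPLICATE_LINES = re.compile(r"^(.*)(?:\r?\n\1)+$", re.MULTILINE)


def collapse_adjacent_duplicates(text):
    """Replace each run of identical consecutive lines with a single copy."""
    return DUPLICATE_LINES.sub(r"\1", text)


# Hypothetical sample resembling the output shown in the question.
sample = "12 Volt\n12 Volt\n40 Volt\n2 Volt\n18 Volt\n18 Volt"
print(collapse_adjacent_duplicates(sample))
```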