How do I find the matching substring between two lists when it is fully contained in a string from the other list?

Posted 2024-09-30 10:39:41


Here are two lists:

list1 = ['apple pie', 'apple cake', 'the apple pie', 'the apple cake', 'apple']

list2 = ['apple', 'lots of apple', 'here is an apple', 'humungous apple', 'carrot cake']

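I tried an algorithm called longest substring finder, but as the name suggests, it finds the longest common run of characters rather than the matching word, so it does not return the result I want.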

def longestSubstringFinder(string1, string2):
    answer = "NULL"
    len1, len2 = len(string1), len(string2)
    for i in range(len1):
        match = ""
        for j in range(len2):
            if (i + j < len1 and string1[i + j] == string2[j]):
                match += string2[j]
            else:
                if (len(match) > len(answer)): answer = match
                match = ""
    return answer


mylist = []

def call():
    for i in file_names_short:
        s1 = i
        for j in company_list:
            s2 = j
            s1 = s1.lower()
            s2 = s2.lower()
            while(longestSubstringFinder(s2,s1) != "NULL"):
                x = longestSubstringFinder(s2,s1)
                # print(x)
                mylist.append(x)
                s2 = s2.replace(x, ' ')

call()
print('[%s]' % ','.join(map(str, mylist)))

The expected output is:

output = ['apple', 'apple', 'apple', 'apple', '']

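The word is not always apple. The real data is a much larger list containing many words, but I am always looking for the word that matches between the two lists, and in this example apple is always the longest such word in list1.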

Another example (perhaps clearer):

string1 = ['Walgreens & Co.', 'Amazon Inc']
string2 = ['walgreens customers', 'amazon products', 'other words']
output = ['walgreens', 'amazon', '']

1 answer
Anonymous user
#1 · Posted 2024-09-30 10:39:41

Edit: updated to return the longest match.

list1 = ['apple pie cucumber', 'apple cake', 'the apple pie', 'the apple cake', 'apple']
list2 = ['apple cucumber', 'lots of apple', 'here is an apple', 'humungous apple', 'carrot cake']

result = []

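# Compare the strings pairwise by index (list2 must be at least as long as list1).
for i in range(len(list1)):
    match = []
    words1, words2 = list1[i].split(), list2[i].split()
    # Collect every word of the list1 entry that also appears in the list2 entry.
    for w in words1:
        if w in words2:
            match.append(w)

    # Keep the longest shared word, or an empty string when nothing matches.
    longest = max(match, key=len) if match else ''
    result.append(longest)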

print(result)

Output:

['cucumber', 'apple', 'apple', 'apple', '']
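
The loop above is case-sensitive and indexes list2 by the length of list1, so it would not reproduce the second example in the question (mixed case, lists of different length). Below is a minimal sketch of one way to handle that, assuming the strings are still meant to be compared pairwise by position and that matching should ignore case. The function name `longest_common_words` is hypothetical, and padding the shorter list with empty strings via `itertools.zip_longest` is an assumption about the desired behaviour, not something the question states.

```python
from itertools import zip_longest


def longest_common_words(list1, list2):
    """For each aligned pair, return the longest lowercased word found in both strings."""
    result = []
    # zip_longest pads the shorter list with '' so every entry gets an output slot.
    for s1, s2 in zip_longest(list1, list2, fillvalue=''):
        words2 = set(s2.lower().split())
        # Words of s1 (lowercased) that also occur in s2.
        shared = [w for w in s1.lower().split() if w in words2]
        result.append(max(shared, key=len) if shared else '')
    return result


print(longest_common_words(
    ['Walgreens & Co.', 'Amazon Inc'],
    ['walgreens customers', 'amazon products', 'other words'],
))
# ['walgreens', 'amazon', '']
```

Splitting on whitespace means punctuation stays attached to a word ('Co.' would not match 'co'), so stripping punctuation first may be needed for real company names.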
