Grouping the words of a string that share a common substring of a given length

2 answers

User

Answer 1 · edited 2024-06-28 20:44:45

Here is one solution:

str_ = "the games are lame"

# first I get a list of all the words
words = str_.split()
# words >>> ['the', 'games', 'are', 'lame']

# This list will contain the groups of words
groups = []

# For each word
for word in words:
    found = False

    # Get the first word of each existing group
    other_words = [group[0] for group in groups]

    # Loop through the word and get every substring of exactly 3 characters
    for i in range(len(word) - 2):
        substring = word[i:i+3]

        try:
            # Try to find the substring in a group and get the index of that group
            index = [substring in other_word for other_word in other_words].index(True)
        except ValueError:
            continue

        # Add the word to the group, then stop so it is not added a second time
        groups[index].append(word)
        found = True
        break

    # If we don't find a group for the word, we create a new group with that word in it
    if not found:
        groups.append([word])


# groups >>> [['the'], ['games', 'lame'], ['are']]

# Now print the groups
for group in groups:
    print(", ".join(group))

Output:

the
games, lame
are
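
The loop above compares each word only against the first word of every group. As a minimal sketch of a variant, each word can be turned into a set of its substrings and checked against every member of a group at once. The helper names `ngrams` and `group_words` and the parameter `n` are hypothetical and not part of the answer; as in the code above, a word joins the first group it matches.

```python
def ngrams(word, n=3):
    """Return the set of all substrings of `word` with exactly n characters."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def group_words(sentence, n=3):
    """Group words that share a substring of length n with any group member.

    Hypothetical helper: a word joins the first matching group,
    otherwise it starts a new group.
    """
    groups = []       # list of lists of words
    group_grams = []  # for each group, the union of its members' substrings
    for word in sentence.split():
        grams = ngrams(word, n)
        for members, seen in zip(groups, group_grams):
            if grams & seen:      # at least one common substring
                members.append(word)
                seen |= grams     # in-place update of the group's set
                break
        else:
            # No group matched: start a new one
            groups.append([word])
            group_grams.append(set(grams))
    return groups


for group in group_words("the games are lame"):
    print(", ".join(group))
```

For the sample sentence this prints the same three lines as the output above. Words shorter than `n` produce an empty set and therefore always end up in a group of their own.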

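User

Answer 2 · edited 2024-06-28 20:44:45

I think you are creating a lot of lists there, which can get quite confusing.

If you want a plain-logic approach that does not rely on a library designed for sequence matching, such as difflib, you can first define a function that compares two strings. Then split your sentence into a list of words and iterate over that list twice (nested), comparing every possible pair of words.

Words that match are printed on the same line, separated by commas; a word without a match is printed on a line of its own.

In the function below I also added a parameter for the length of the substring to match, set to 3 by default to stay consistent with your question: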

# This function compares two strings and returns them in a tuple if they contain the
# same substring of len_substring characters; otherwise it returns None.

def string_matcher(string_a, string_b, len_substring=3):
    # +1 so that the last window of string_a is checked too
    for i in range(len(string_a) - len_substring + 1):
        if string_a[i:i+len_substring] in string_b:
            return string_a, string_b
    return None

string = "the games are lame"
words = string.split()

lines = []
used = set()  # indices of words that were already placed on a line

# Making a double iteration over the words list and calling string_matcher for each pair.
for i in range(len(words)):
    if i in used:
        continue
    line = [words[i]]
    for j in range(i+1, len(words)):
        if j not in used and string_matcher(words[i], words[j]) is not None:
            line.append(words[j])
            used.add(j)
    lines.append(", ".join(line))

output = "\n".join(lines)
print(output)

The program prints:

the
games, lame
are
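
For comparison with the difflib route mentioned at the start of this answer, here is a minimal sketch that uses `difflib.SequenceMatcher.find_longest_match` from the standard library. The wrapper name `shares_substring` is hypothetical and not part of the answer. It checks whether the longest contiguous block common to both words has at least `len_substring` characters, which amounts to the same condition as sharing a substring of exactly that length.

```python
from difflib import SequenceMatcher


def shares_substring(word_a, word_b, len_substring=3):
    """Return True if both words share a contiguous run of at least
    len_substring characters (hypothetical helper, not from the answer)."""
    # autojunk=False switches off the popularity heuristic for long inputs
    matcher = SequenceMatcher(None, word_a, word_b, autojunk=False)
    match = matcher.find_longest_match(0, len(word_a), 0, len(word_b))
    return match.size >= len_substring


print(shares_substring("games", "lame"))  # True  ("ame" is shared)
print(shares_substring("the", "are"))     # False (only "e" is shared)
```

For short words like these the plain slicing loop is just as readable; the difflib version mainly saves writing the window logic by hand.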
