How can we compare two tries for similarity?

Posted 2024-09-26 21:01:14


I'm just curious whether there is a way to compare two tries (the data structure) for similarity.

trie1                      trie2

   root                     root 
/     |                   /   |
m     b                   m   b
|     |                   |   |
a     o                   a   o
| \   |                   |   |
t  x  b                   x   b

def compare_trie(trie1, trie2):
    pass

Expected output: ["max", "bob"]

Edit: So far I have tried to implement a DFS algorithm, but I got stuck on how to manage the stacks for two different tries.

Here is my attempt so far, which manages two stacks for the two tries:

def compareTrie(trie1, trie2):
    dfsStack = []
    result = []
    stack1 = [x for x in trie1.keys()]
    stack2 = [y for y in trie2.keys()]
    similar = list(set(stack1) & set(stack2))
    dfsStack.append((similar, result))
    while (dfsStack):
        current, result = dfsStack.pop()
        print(current, result)
        result.append(current)
        for c in current:
            trie1 = trie1[c]
            trie2 = trie2[c]
            st1 = [x for x in trie1.keys()]
            st2 = [x for x in trie2.keys()]
            simm = list(set(st1) & set(st2))
            dfsStack.append((simm, result))

    print(result)

Trie implementation:

def create_trie(words):
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        # Mark the end of a word
        curr['#'] = True
    return trie


s1 = "mat max bob"
s2 = "max bob"

words1 = s1.split()
words2 = s2.split()

t1 = create_trie(words1)
t2 = create_trie(words2)

3 answers

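Yes, it is possible. Most things are, since you are working in a general-purpose language on a machine that is, for practical purposes, Turing-complete.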

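The simplest way to do this is to traverse each trie and build the set of all the words it contains, then take the intersection of the two sets.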
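A minimal sketch of that set-based idea, assuming the nested-dict trie from the question with '#' as the end-of-word marker. The helper names `iter_words` and `common_words` are made up for illustration and do not appear in the original thread.

```python
def create_trie(words):
    # Same nested-dict trie as in the question; '#' marks the end of a word.
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            curr = curr.setdefault(c, {})
        curr['#'] = True
    return trie


def iter_words(trie, prefix=""):
    # Hypothetical helper: yield every complete word stored in the trie.
    for key, child in trie.items():
        if key == '#':
            yield prefix
        else:
            yield from iter_words(child, prefix + key)


def common_words(trie1, trie2):
    # Hypothetical helper: words present in both tries, as a set.
    return set(iter_words(trie1)) & set(iter_words(trie2))


t1 = create_trie("mat max bob".split())
t2 = create_trie("max bob".split())
print(sorted(common_words(t1, t2)))  # ['bob', 'max']
```

This walks both tries completely, so it does more work than a traversal that follows only the branches the two tries share, but it is easy to reason about.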

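You can replace recursion with a single stack that holds the current state (the pair of matching nodes and the prefix built so far), and create the result list inside the compare function: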

def compare(trie1, trie2):
    result = []
    stack = [(trie1, trie2, "")]
    while stack:
        t1, t2, curr = stack.pop()
        for i in t1:
            if i not in t2:
                continue
            if i == "#":
                result.append(curr)
            else:
                stack.append((t1[i], t2[i], curr + i))
    return result
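As a rough usage check, the stack-based function can be run on the two tries from the question. The `create_trie` and `compare` definitions below are copied from this thread so the snippet runs on its own. Because the stack is last-in, first-out, the words may come back in a different order from the one shown in the question, so sorting the result (or comparing it as a set) is a reasonable precaution.

```python
def create_trie(words):
    # Nested-dict trie from the question; '#' marks the end of a word.
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        curr['#'] = True
    return trie


def compare(trie1, trie2):
    # Stack-based traversal from the answer above: follow only shared keys.
    result = []
    stack = [(trie1, trie2, "")]
    while stack:
        t1, t2, curr = stack.pop()
        for i in t1:
            if i not in t2:
                continue
            if i == "#":
                result.append(curr)
            else:
                stack.append((t1[i], t2[i], curr + i))
    return result


t1 = create_trie("mat max bob".split())
t2 = create_trie("max bob".split())
common = compare(t1, t2)
print(common)          # ['bob', 'max'] (LIFO order, not insertion order)
print(sorted(common))  # ['bob', 'max']
```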

Your idea of using DFS is correct; however, you can opt for a simple recursive approach to solve the task at hand. Here is the recursive version:

def create_trie(words):
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        # Mark the end of a word
        curr['#'] = True
    return trie

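def compare(trie1, trie2, curr):
    # Walk both tries in lockstep; matches are appended to the module-level `result` list.
    for i in trie1:
        # Test membership rather than truthiness, so an empty child node would still count.
        if i in trie2:
            if i == "#":
                result.append(curr)
            else:
                compare(trie1[i], trie2[i], curr + i)
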

s1 = "mat max bob temp2 fg f r"
s2 = "max bob temp fg r c"

words1 = s1.split()
words2 = s2.split()

t1 = create_trie(words1)
t2 = create_trie(words2)
result = []
compare(t1, t2, "")
print(result)   #['max', 'bob', 'fg', 'r']
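The recursive version above relies on a module-level `result` list. One possible variation, sketched here under the same trie layout, passes the accumulator along and returns it, so the function can be called more than once without resetting a global. The name `common_words` and the `found` parameter are illustrative assumptions, not part of the original answer.

```python
def create_trie(words):
    # Nested-dict trie from the question; '#' marks the end of a word.
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        curr['#'] = True
    return trie


def common_words(trie1, trie2, prefix="", found=None):
    # Hypothetical variant: collect shared words in `found` and return it.
    if found is None:
        found = []
    for key, child in trie1.items():
        if key not in trie2:
            continue
        if key == '#':
            found.append(prefix)
        else:
            common_words(child, trie2[key], prefix + key, found)
    return found


t1 = create_trie("mat max bob temp2 fg f r".split())
t2 = create_trie("max bob temp fg r c".split())
print(common_words(t1, t2))  # ['max', 'bob', 'fg', 'r']
```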
