使用tri不带空格分隔的字符串

2024-10-01 22:31:58 发布

您现在位置:Python中文网/ 问答频道 /正文

我仍然很难理解算法中的所有步骤,即如何在字典的基础上对输入字符串进行所有可能的分区(没有空格)。你知道吗

我读到我们必须使用TRIE结构,使用一些带有回溯(深度优先搜索)的动态编程,但我不理解它在概念上是如何工作的。你知道吗

首先我想要的是:

input 
string = `theology`

    dictionnary 
    dict = [the , theology]

output i am expecting

[the, o, l, o, g, y ] ,[theology]

关于我的“理解不好,这是我看到的步骤”(我使用R语言,但我更需要理解逻辑,而不是有一个具体的语言答案)

0 - Transform my dictionnary into a trie 

```
library("triebeard")
trie <- trie(keys = c("the", "theology","toto","what"),
             values = c("the", "theology",
             "toto","what"))
```

1 - calculate the number of character on the string 
    - On this case I have 8.

2 - Take character by character n+1 and look inside the trie

2.1 I take the first character is t so i look  at t and do the longest match 
    It return NA so i go to the next character
    ```
    longest_match(trie, "t")
    ```
    2.2 Next char `"th"` returns NA 

2.3 Next char "the" returns `"the"` so i add it the a list 

3a - Now i don't know what to do should I continue to the 4 position take one character
look at the longest match see NA and so on so forth 

3b - Or should go and look at  `"theo"`

我想这和前缀匹配而不是最长匹配有关。你知道吗

如果您能指导我如何使用trie获得预期的输出

[更新]

为了更好地理解我的问题,这里有一个可能的解决方案,在python中使用dictionary,我发现/修改了interjay的答案:https://stackoverflow.com/a/2174156/5348895

wordList = ['the','theology'] words = set( s.lower() for s in wordList ) def splitString(s): found = [] def rec(stringLeft, wordsSoFar): if not stringLeft: found.append(wordsSoFar) for pos in xrange(1, len(stringLeft)+1): if stringLeft[:pos] in words: rec(stringLeft[pos:], wordsSoFar + [stringLeft[:pos]]) elif (pos==len(stringLeft)) and stringLeft[:pos] not in wordList and len(stringLeft)<len(s): rec(stringLeft[1:], wordsSoFar + [stringLeft[:1]]) rec(s.lower(), []) return found output = splitString('theology') for x in output: print ','.join(x)

谢谢


Tags: andtheinposoutputlensowhat

热门问题