<p>There are several problems here:</p>
<ol>
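<li>You must strip the lines, otherwise each one keeps its linefeed/CR characters and the match fails.</li>
<li>You must read the file once and keep the result, otherwise the file iterator is exhausted after the first pass.</li>
<li>Speed is poor: use a <code>set</code> instead of a <code>list</code> to speed up the lookups.</li>
<li>The slicing is overcomplicated and wrong: <code>str[1:-1]</code> does the job (thanks to those who commented on my answer).</li>
<li>The whole code is far too long and complicated. I condensed it into a few lines.</li>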
</ol>
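<p>The first two points can be reproduced in isolation. This is only a minimal sketch: <code>io.StringIO</code> and the four sample words are stand-ins I am assuming for illustration, not the real <code>good_words.txt</code>.</p>

```python
import io

# io.StringIO stands in for the open file (illustrative data, not the real word list)
fake_file = io.StringIO("most\nso\nfrom\nor\n")

raw = list(fake_file)          # first pass: every line keeps its trailing "\n"
second_pass = list(fake_file)  # second pass: the iterator is already exhausted

print("so" in raw)             # the stored item is "so\n", so this lookup fails
print(second_pass)             # nothing is left to read

# strip once, store the result, and reuse it for every lookup
words = set(line.strip() for line in raw)
print("so" in words)
```

<p>Reading into a container once and stripping at that point avoids both problems, which is what the code below does.</p>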
<p>Code:</p>
<pre><code># read in the text file
with open('good_words.txt') as myFile:
    # make a set (faster lookup), strip linefeeds
    lines = set(x.strip() for x in myFile)

# iterate over the words
for word in lines:
    # only consider strings of length 4 or more
    if len(word) >= 4:
        modifiedStr = word[1:-1][::-1]  # drop first and last char, reverse the rest
        if modifiedStr in lines:
            print(modifiedStr + " found (was " + word + ")")
        else:
            print(modifiedStr + " not found")
</code></pre>
<p>I tested this program on a list of common English words and got the following matches:</p>
<pre><code>so found (was most)
or found (was from)
no found (was long)
on found (was know)
to found (was both)
</code></pre>
<p>Edit: another version, which drops the <code>set</code> and uses <code>bisect</code> on a sorted list, to avoid hashing and hash collisions.</p>
<pre><code>import bisect

# read in the text file
with open("good_words.txt") as myFile:
    # make a sorted list, strip linefeeds
    lines = sorted(x.strip() for x in myFile)

for word in lines:
    # only modify strings of length 4 or more
    if len(word) >= 4:
        modifiedStr = word[1:-1][::-1]  # drop first and last char, reverse the rest
        # find where the modified word would be inserted
        i = bisect.bisect_left(lines, modifiedStr)
        # if the index is valid and the word is actually at this position: found
        if i < len(lines) and lines[i] == modifiedStr:
            print(modifiedStr + " found (was " + word + ")")
        else:
            print(modifiedStr + " not found")
</code></pre>
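<p>To check that the two versions agree, both lookups can be wrapped in functions and run on the same input. This is a sketch under assumptions: the function names <code>find_matches_set</code> and <code>find_matches_bisect</code> and the short sample list are mine, chosen for illustration.</p>

```python
import bisect

def find_matches_set(words):
    """Return (modified, original) pairs, using set membership for the lookup."""
    lookup = set(words)
    pairs = []
    for word in words:
        if len(word) >= 4:
            modified = word[1:-1][::-1]  # drop first and last char, reverse the rest
            if modified in lookup:
                pairs.append((modified, word))
    return pairs

def find_matches_bisect(words):
    """Same result, using binary search on a sorted list for the lookup."""
    ordered = sorted(words)
    pairs = []
    for word in ordered:
        if len(word) >= 4:
            modified = word[1:-1][::-1]
            i = bisect.bisect_left(ordered, modified)
            # bisect_left only gives an insertion point, so confirm the value is there
            if i < len(ordered) and ordered[i] == modified:
                pairs.append((modified, word))
    return pairs

# hypothetical sample, not the real word list
sample = ["most", "so", "from", "or", "long", "know", "on"]
set_pairs = sorted(find_matches_set(sample))
bisect_pairs = sorted(find_matches_bisect(sample))
print(set_pairs == bisect_pairs)
```

<p>The <code>set</code> lookup is O(1) on average, while <code>bisect</code> costs O(log n) per lookup plus one initial sort, so the sorted-list version mainly makes sense if you want to avoid hashing altogether.</p>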