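<p>There are some open questions about <em>how exactly</em> you want to perform the keyword search. Your example already contains one obstacle: how should characters such as commas be handled? It is also unclear what to do with lines that do not contain the keyword, and what to do when there are fewer than two words before or after the keyword. My guess is that you are not quite sure about the exact requirements yourself and have not yet considered all the edge cases.</p>
<p>Nevertheless, I made some "blind decisions" on these questions. Below is a naive example implementation that assumes a very simple keyword-matching rule. I created the function <code>findword()</code>, which you can adjust as needed. Perhaps this example helps you work out your own requirements.</p>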
<pre><code>KEYWORD = "lists"

S = """12088|CITA|{Hello very nice lists, better to keep those
12089|CITA|This is great theme for lists keep it """


def findword(words, keyword):
    """Return index of first occurrence of `keyword` in sequence
    `words`, otherwise return None.

    The current implementation searches for "keyword" as well as
    for "keyword," (with trailing comma).
    """
    for test in (keyword, "%s," % keyword):
        try:
            return words.index(test)
        except ValueError:
            pass
    return None


for line in S.splitlines():
    tokens = line.split("|")
    words = tokens[2].split()
    idx = findword(words, KEYWORD)
    if idx is None:
        # Keyword not found. Print line without change.
        print(line)
        continue
    # Keep up to two words on either side of the keyword.
    start = max(idx - 2, 0)
    # A slice end beyond len(words) is clamped, so no upper guard is needed.
    end = idx + 3
    tokens[2] = " ".join(words[start:end])
    print("|".join(tokens))
</code></pre>
<p>Test:</p>
<pre><code>$ python test.py
12088|CITA|very nice lists, better to
12089|CITA|theme for lists keep it
</code></pre>
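<p>If the matching rule needs to tolerate more than a trailing comma, one possible adjustment is to strip surrounding punctuation from each word before comparing. The following is only a sketch under that assumption; the name <code>findword_loose</code> is hypothetical and not part of the code above, and whether this rule fits depends on your actual requirements.</p>

```python
import string


def findword_loose(words, keyword):
    """Return the index of the first word that equals `keyword` once
    leading and trailing punctuation is stripped, otherwise None.

    Hypothetical variant of findword(); assumes punctuation attached
    to a word should be ignored when matching.
    """
    for i, word in enumerate(words):
        if word.strip(string.punctuation) == keyword:
            return i
    return None


words = "{Hello very nice lists, better to keep those".split()
print(findword_loose(words, "lists"))
```

<p>Note that this also matches forms such as <code>(lists)</code> or <code>lists.</code>, which may or may not be what you want.</p>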
<p>PS: I hope I got the indices right for the slicing. You should double-check them, though.</p>
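<p>One way to check the slice indices is to isolate the window computation and try it with the keyword at the start, near the end, and at the very end of the word list. This is a minimal sketch; the helper name <code>context_window</code> and its parameters are assumptions made for illustration.</p>

```python
def context_window(words, idx, before=2, after=2):
    """Return words[idx] with up to `before` words in front of it and
    up to `after` words behind it (hypothetical helper)."""
    # A negative start would count from the end of the list, so clamp at 0.
    start = max(idx - before, 0)
    # A slice end past len(words) is clamped automatically.
    return words[start:idx + after + 1]


words = "This is great theme for lists keep it".split()
for idx in (0, 5, 6, 7):
    print(idx, context_window(words, idx))
```

<p>The case to watch is a keyword in the second-to-last position: using <code>-1</code> as the slice end there would silently drop the last word, because a slice end of <code>-1</code> excludes the final element.</p>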