How do I use the Highlighter?

Posted 2024-05-18 16:16:33


I have read a few tutorials on highlighting search terms in Lucene and came up with the following code:

(...)
query = parser.parse(query_string)

for scoreDoc in  searcher.search(query, 50).scoreDocs:
    doc = searcher.doc(scoreDoc.doc)
    filename = doc.get("filename")
    print filename
    found_paraghaph = fetch_from_my_text_library(filename)

    stream = lucene.TokenSources.getTokenStream("contents", found_paraghaph, analyzer);
    scorer = lucene.Scorer(query, "contents", lucene.CachingTokenFilter(stream))
    highligter = lucene.Highligter(scorer)
    fragment = highligter.getBestFragment(analyzer, "contents", found_paraghaph)
    print '>>>' + fragment

But it all ends with an error:

[error traceback not preserved in the source]

So I suppose this part of Lucene is not implemented in PyLucene yet. Is there another way to do it?


1 Answer

I ran into a similar error. I think the wrapper for this class is not implemented yet in PyLucene v3.6.

You may want to try the following:

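# Assumes PyLucene 3.x (classes imported from the flat `lucene` module), that
# lucene.initVM() has already been called, and that `searcher`, FIELD_CONTENTS,
# FIELD_PATH and QUERY_STRING are defined elsewhere.
from lucene import (Highlighter, QueryParser, QueryScorer, SimpleHTMLFormatter,
                    StandardAnalyzer, StringReader, Version)

analyzer = StandardAnalyzer(Version.LUCENE_CURRENT)

# Construct a query parser.
queryParser = QueryParser(Version.LUCENE_CURRENT, FIELD_CONTENTS, analyzer)

# Create a query.
query = queryParser.parse(QUERY_STRING)

topDocs = searcher.search(query, 50)

# Get the top hits (at most 50).
scoreDocs = topDocs.scoreDocs
print "%s total matching documents." % len(scoreDocs)

# The formatter wraps matched terms in HTML tags; the scorer ranks fragments
# against the query.
formatter = SimpleHTMLFormatter()
highlighter = Highlighter(formatter, QueryScorer(query))

for scoreDoc in scoreDocs:
    doc = searcher.doc(scoreDoc.doc)
    # The field must be stored in the index, otherwise doc.get() returns None.
    text = doc.get(FIELD_CONTENTS)
    # Build a fresh token stream for each document.
    ts = analyzer.tokenStream(FIELD_CONTENTS, StringReader(text))
    print doc.get(FIELD_PATH)
    print highlighter.getBestFragments(ts, text, 3, "...")
    print ""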

Note that we create a token stream for each item in the search results.
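For readers without a PyLucene install, the idea behind the loop can be sketched in plain Python. This is only an illustration of the concept, not Lucene's actual algorithm: `token_stream` and `best_fragments` are hypothetical helpers, the fragmenting and scoring rules are simplified assumptions, and the `<B>` tags mirror the default markup of `SimpleHTMLFormatter`. The generator also shows why a stream has to be rebuilt per result: once consumed, it yields nothing more.

```python
import re

def token_stream(text):
    """Yield (term, start, end) per word. A generator, so it can be consumed only once."""
    for match in re.finditer(r"\w+", text):
        yield match.group(0).lower(), match.start(), match.end()

def best_fragments(text, query_terms, max_fragments=3, fragment_size=40, separator="..."):
    """Return up to max_fragments highlighted fragments, joined by separator."""
    terms = set(term.lower() for term in query_terms)
    fragments = []              # (score, highlighted text) per fragment
    out, score = [], 0          # pieces and hit count of the current fragment
    frag_start = last_end = 0
    for term, start, end in token_stream(text):
        gap = text[last_end:start]   # whitespace/punctuation before this token
        if start - frag_start >= fragment_size:
            # Close the current fragment and start a new one at this token.
            fragments.append((score, "".join(out) + gap))
            out, score, frag_start, gap = [], 0, start, ""
        word = text[start:end]
        if term in terms:
            word = "<B>" + word + "</B>"
            score += 1
        out.append(gap + word)
        last_end = end
    fragments.append((score, "".join(out) + text[last_end:]))

    # Keep only fragments containing a hit, pick the highest scoring ones,
    # then restore document order.
    hits = [(s, i, frag.strip()) for i, (s, frag) in enumerate(fragments) if s > 0]
    top = sorted(hits, key=lambda h: -h[0])[:max_fragments]
    return separator.join(frag for _, _, frag in sorted(top, key=lambda h: h[1]))

# Hypothetical search results: (path, stored text) pairs.
results = [
    ("a.txt", "Lucene is a search library. PyLucene wraps Lucene for Python users."),
    ("b.txt", "Nothing relevant here."),
]
for path, text in results:
    # best_fragments builds a fresh token stream for every result.
    print(path)
    print(best_fragments(text, ["lucene"], fragment_size=1000))
    print("")
```

A result with no matching term produces an empty string here, so the caller may want to fall back to the start of the stored text in that case.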
