Whoosh Slop Operator Behaviour

Posted 2024-09-28 20:54:00


# Text: income tax expense resulting from the utilization of net operating loss carry forwards

Query formats I tried:

q = QueryParser(u"content", ix.schema).parse(u"income utilization~3")
q = QueryParser(u"content", ix.schema).parse(u"'income utilization'~3")

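The slop operator does not seem to work for my use case. It ignores the slop value given in the formats above and always returns a result, even when the slop condition is not satisfied. Can you help?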

Output:

 (content:income AND content:utilization)
 <Hit {'title': u'test'}>

Full code snippet:

import os

from whoosh.fields import Schema, ID, TEXT
from whoosh.index import create_in, open_dir
from whoosh.qparser import QueryParser


schema = Schema(title=ID(stored=True), content=TEXT)

def setup():
    if not os.path.exists("indexdir"):
        os.makedirs("indexdir")

    ix = create_in("indexdir", schema)
    writer = ix.writer()
    writer.add_document(title=u"test", content=u"income tax expense resulting from the utilization of net operating loss carry forwards")
    writer.commit()

def fetch():
    ix = open_dir("indexdir")
    with ix.searcher() as searcher:
        q = QueryParser(u"content", ix.schema).parse(u"income utilization~3")
        print(q)  # show how the query string was parsed
        results = searcher.search(q)
        for r in results:
            print(r)

if __name__ == '__main__':
    setup()
    fetch()

1 Answer
User
#1 · Posted 2024-09-28 20:54:00

You are confusing the fuzzy operator with the slop operator:

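  1. Fuzzy operator (edit distance): word~ or word~n, used for fuzzy terms; it matches terms within an edit distance of n from word.
  2. Slop operator: "word1 word2 ... wordk"~n, used for phrase searches with a slop of n.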
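
To make the distinction concrete, here is a rough pure-Python sketch of the two ideas. It is a simplified model, not Whoosh's implementation: `edit_distance` and `within_slop` are hypothetical helper names, the tokens come from a plain whitespace split, and slop is assumed to mean "the second word appears at most n positions after the first". Whoosh's analyzer may drop stop words and number positions differently, so the exact threshold in a real index can differ.

```python
# Simplified models of the two operators; NOT Whoosh's implementation.

def edit_distance(a, b):
    """Levenshtein distance: character-level edits needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete a character from a
                cur[j - 1] + 1,            # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def within_slop(tokens, first, second, slop):
    """True if `second` occurs after `first`, at most `slop` positions later."""
    firsts = [i for i, t in enumerate(tokens) if t == first]
    seconds = [i for i, t in enumerate(tokens) if t == second]
    return any(0 < s - f <= slop for f in firsts for s in seconds)


text = ("income tax expense resulting from the utilization "
        "of net operating loss carry forwards")
tokens = text.split()  # assumption: naive whitespace tokenization

# Fuzzy matching compares the characters of a single term.
print(edit_distance("utilization", "utilisation"))

# Slop compares the positions of several terms inside one phrase.
print(within_slop(tokens, "income", "utilization", 3))
print(within_slop(tokens, "income", "utilization", 6))
```

The fuzzy check never looks at word positions, and the slop check never looks inside a word, which is why a ~n suffix means different things on a bare term and on a quoted phrase.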

You should try:

# "income utilization"~3
q = QueryParser(u"content", ix.schema).parse(u'"income utilization"~3') 

References:

  1. Adding fuzzy term queries
  2. whoosh.qparser.PhrasePlugin
