How fast is Whoosh?

Posted 2024-10-01 15:41:47


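Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python (official website).

But I can't find any speed/performance comparison with other search engines, especially the Lucene-based ones (pyLucene, Lupyne, ...).

I'm used to working with pyLucene, which is known to be fast but is rather un-Pythonic and not easy to handle (it is a direct Java Lucene wrapper). There is a Pythonic skin for pyLucene, Lupyne. However, it is not convenient when Lucene's core features are needed.

Any pointers on how Whoosh performs compared with the others would be appreciated.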


1 Answer
User
Reply 1 · Posted 2024-10-01 15:41:47

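Whoosh vs. Xappy/Xapian

There are some benchmarks comparing Whoosh with Xappy/Xapian-backed search in Python here.

The author of Whoosh used these benchmarks to test Whoosh against Xappy/Xapian (ref).

How the benchmark works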

N documents are generated, the search word is a random word and 10 chars long, plus 10 extra fields with 100 chars of random stuff each (just to pump up the size of the document).

For indexing, all fields are indexed and stored.

For searching, all words are searched in random order and all stored fields are retrieved.
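The document generation described above can be sketched with the standard library alone. This is only an illustration, not the benchmark's actual code: the constant names are taken from the "Params" section of the results, while the field names (`word`, `extra0` ... `extra9`), the lowercase-letter alphabet, and the helper names are assumptions made for the example.

```python
import random
import string

# Values as reported in the benchmark's "Params" section.
DOC_COUNT = 3000
WORD_LEN = 10
EXTRA_FIELD_COUNT = 10
EXTRA_FIELD_LEN = 100


def random_text(rng, length):
    """Return `length` random lowercase letters (the alphabet is an assumption)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def make_docs(doc_count=DOC_COUNT, seed=0):
    """Build documents with one search word plus padding fields.

    Field names "word" and "extraN" are hypothetical.
    """
    rng = random.Random(seed)
    docs = []
    for _ in range(doc_count):
        doc = {"word": random_text(rng, WORD_LEN)}
        for i in range(EXTRA_FIELD_COUNT):
            doc["extra%d" % i] = random_text(rng, EXTRA_FIELD_LEN)
        docs.append(doc)
    return docs


def search_order(docs, seed=1):
    """Return every search word once, shuffled into a random query order."""
    words = [doc["word"] for doc in docs]
    random.Random(seed).shuffle(words)
    return words
```

Each document then carries 11 fields, all of which would be indexed and stored, and `search_order(docs)` gives the sequence of queries to run against the index.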

For whoosh, we used the multiprocessing writer for building the index - this explains why it is faster for indexing than xappy (because it used all 4 cores, not just 1).
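In Whoosh, the multiprocessing writer is requested through the `procs` argument of the index's `writer()` method. The following is a minimal sketch under stated assumptions, not the code used in the benchmark: it assumes Whoosh is installed, and the helper name `build_index` and the use of stored `TEXT` fields for every key are choices made for the example.

```python
def build_index(index_dir, docs, procs=4):
    """Index `docs` (dicts mapping field name to text) into `index_dir`.

    `build_index` is a hypothetical helper; `procs` is passed straight
    to Whoosh's `writer()`.
    """
    # Imported here so the module still loads where Whoosh is absent.
    from whoosh import index
    from whoosh.fields import Schema, TEXT

    # Every field is indexed and stored, as in the benchmark description.
    field_names = sorted(docs[0])
    schema = Schema(**{name: TEXT(stored=True) for name in field_names})

    ix = index.create_in(index_dir, schema)
    # procs > 1 asks Whoosh for its multiprocessing writer;
    # procs=1 gives the ordinary single-process writer.
    writer = ix.writer(procs=procs)
    for doc in docs:
        writer.add_document(**doc)
    writer.commit()
    return ix
```

Calling `build_index(path, docs, procs=4)` on a four-core machine mirrors the setup described above. On platforms where `multiprocessing` starts workers by spawning a fresh interpreter, the call should sit under an `if __name__ == "__main__":` guard.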

For searching, xappy/xapian is faster (there was no parallel processing used). But you see that the speed difference between xappy and whoosh is maybe not as big as you expected.

Index size is about 12 MB

# Phenom II X4 840, 8GB RAM, HDD
# Python 2.7.2+ (default, Oct  4 2011, 20:06:09) 
# [GCC 4.6.1] on linux2

Params:
DOC_COUNT: 3000 WORD_LEN: 10
EXTRA_FIELD_COUNT: 10 EXTRA_FIELD_LEN: 100

Benchmarking: xappy 0.5 / xapian 1.2.5
Indexing takes 2.8s (1068.9/s)
Searching takes 0.5s (6635.8/s)

Benchmarking: whoosh 2.3.2
Indexing takes 0.8s (3575.6/s)
Searching takes 0.8s (3714.8/s)
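To put the figures side by side, the reported rates can be turned into ratios. This is plain arithmetic on the numbers printed above; the dictionary layout and variable names are made up for the example, and the result describes only this one run (3000 documents, one machine).

```python
# Rates (documents or queries per second) copied from the output above.
rates = {
    "xappy 0.5 / xapian 1.2.5": {"indexing": 1068.9, "searching": 6635.8},
    "whoosh 2.3.2": {"indexing": 3575.6, "searching": 3714.8},
}

xapian = rates["xappy 0.5 / xapian 1.2.5"]
whoosh = rates["whoosh 2.3.2"]

# Whoosh indexed with 4 processes, xappy with 1.
indexing_ratio = whoosh["indexing"] / xapian["indexing"]
# Searching was single-process for both.
searching_ratio = xapian["searching"] / whoosh["searching"]

print("Whoosh indexing is %.1fx the xappy rate" % indexing_ratio)
print("xappy searching is %.1fx the Whoosh rate" % searching_ratio)
```

In this run Whoosh indexes a little over three times as fast while using four cores instead of one, and xappy/xapian searches a little under twice as fast.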
