Java: unable to identify the error in Lucene MoreLikeThis

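I need to use Lucene MoreLikeThis to find documents similar to a given piece of text. I am new to Lucene and followed the code here.

I have already indexed the documents in the directory "C:\Users\lucene_index_files\v2".

I am using "They are computer engineers and they like to develop their own tools. The program in languages like Java, CPP." as the document for which I want to find similar documents.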

 public class LuceneSearcher2 {


public static void main(String[] args) throws IOException {
    LuceneSearcher2 m = new LuceneSearcher2();
    System.out.println("1");
    m.start();
    System.out.println("2");
    //m.writerEntries();
    m.findSilimar("They are computer engineers and they like to develop their own tools. The program in languages like Java, CPP.");
    System.out.println("3");
}




private Directory indexDir;
private StandardAnalyzer analyzer;
private IndexWriterConfig config;

public void start() throws IOException{
    //analyzer = new StandardAnalyzer(Version.LUCENE_42);
    //config = new IndexWriterConfig(Version.LUCENE_42, analyzer);
    analyzer = new StandardAnalyzer();
    config = new IndexWriterConfig(analyzer);
    config.setOpenMode(OpenMode.CREATE_OR_APPEND);

    indexDir = new RAMDirectory(); //don't write on disk
    //https://stackoverflow.com/questions/36542551/lucene-in-java-method-not-found?rq=1
    indexDir = FSDirectory.open(FileSystems.getDefault().getPath("C:\\Users\\lucene_index_files\\v2")); //write on disk
    //System.out.println(indexDir);
}
private void findSilimar(String searchForSimilar) throws IOException {
    IndexReader reader = DirectoryReader.open(indexDir);
    IndexSearcher indexSearcher = new IndexSearcher(reader);

    System.out.println("2a");
    MoreLikeThis mlt = new MoreLikeThis(reader);
    mlt.setMinTermFreq(0);
    mlt.setMinDocFreq(0);
    mlt.setFieldNames(new String[]{"title", "content"});
    mlt.setAnalyzer(analyzer);
    System.out.println("2b");



    StringReader sReader = new StringReader(searchForSimilar);

    //Query query = mlt.like(sReader, null);
    //Throws error - The method like(String, Reader...) in the type MoreLikeThis is not applicable for the arguments (StringReader, null)

    Query query = mlt.like("computer");
    System.out.println("2c");
    System.out.println(query.toString());

    TopDocs topDocs = indexSearcher.search(query,10);

    for ( ScoreDoc scoreDoc : topDocs.scoreDocs ) {
        Document aSimilar = indexSearcher.doc( scoreDoc.doc );
        String similarTitle = aSimilar.get("title");
        String similarContent = aSimilar.get("content");

        System.out.println("====similar finded====");
        System.out.println("title: "+ similarTitle);
        System.out.println("content: "+ similarContent);
    }
    System.out.println("2d");

}}

I am not sure what is preventing the program from producing any output.


1 answer

  1. # Answer 1

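    What output do you get? I assume you are not finding any similar documents. The reason could be that the query you are creating is empty.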

    First of all, get the commented-out call to run the way it was meant to. The line

    Query query = mlt.like(sReader, null); 
    

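    cannot work with null: like has to be told which field the text should be analyzed for. Going by the signature quoted in your compiler error, like(String, Reader...), the field name comes first and the reader second, so a call of this shape should compile (assuming "content" is the field you want to compare against):

    Query query = mlt.like("content", sReader); 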
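
To make the role of each argument concrete, here is a minimal sketch built around a hypothetical stand-in method with the same parameter shape as the signature quoted in the compiler error, like(String, Reader...). It is not Lucene's implementation, and the class name LikeSignatureSketch is made up for this example; it only splits on whitespace to show which argument is treated as the field name and which as the text. If the real method binds its arguments the same way, a call such as mlt.like("computer") would pass "computer" as the field name and no text at all, which is one plausible way to end up with an empty query.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT Lucene code: it only mirrors the parameter
// shape like(String fieldName, Reader... readers) from the compiler error.
class LikeSignatureSketch {

    // Returns one "field:term" string per whitespace-separated token read.
    static List<String> like(String fieldName, Reader... readers) {
        List<String> clauses = new ArrayList<>();
        for (Reader reader : readers) {
            try (BufferedReader buffered = new BufferedReader(reader)) {
                String line;
                while ((line = buffered.readLine()) != null) {
                    for (String token : line.trim().split("\\s+")) {
                        if (!token.isEmpty()) {
                            clauses.add(fieldName + ":" + token.toLowerCase());
                        }
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // "computer" is taken as the field name; no reader means no text.
        System.out.println(like("computer"));                                       // []
        // Field name first, then the text to analyze.
        System.out.println(like("content", new StringReader("Computer engineers"))); // [content:computer, content:engineers]
    }
}
```

The first call prints an empty list because the varargs parameter receives zero readers; the second call produces one clause per token.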
    

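    Now, in order to use the MoreLikeThis functionality in Lucene, the fields you store must have term vectors enabled, so set the option "setStoreTermVectors(true);" when you create the field, for example: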

    FieldType fieldType = new FieldType();
    fieldType.setStored(true);
    fieldType.setStoreTermVectors(true);
    fieldType.setTokenized(true);
    Field contentField = new Field("contents", this.getBlurb(), fieldType);
    doc.add(contentField);
    

    Omitting this option can leave the query string empty, which in turn means the query returns no results.
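
As a rough illustration of why missing term statistics can lead to an empty query, the sketch below treats a term vector as a plain map from term to frequency and keeps only the terms that reach a minimum frequency. This is a simplified model, not Lucene's actual selection logic, and the class TermVectorSketch and its method names are hypothetical, introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model, NOT Lucene code: a "term vector" here is just a sorted
// map from term to its frequency within one field of one document.
class TermVectorSketch {

    // Counts lower-cased tokens, splitting on anything that is not a letter or digit.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> frequencies = new TreeMap<>();
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                frequencies.merge(token, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    // Keeps the terms whose frequency is at least minTermFreq.
    static List<String> selectTerms(Map<String, Integer> vector, int minTermFreq) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : vector.entrySet()) {
            if (entry.getValue() >= minTermFreq) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Integer> vector = termVector("They like tools. They build tools in Java.");
        System.out.println(selectTerms(vector, 2));                          // [they, tools]
        // No term statistics available: nothing to select, so the query has no terms.
        System.out.println(selectTerms(new TreeMap<String, Integer>(), 0));  // []
    }
}
```

In this model, an empty map yields no terms even with a threshold of 0, which matches the symptom described above: a query with nothing in it and therefore no hits.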