Elasticsearch can't find words containing an apostrophe (')

Posted 2024-09-30 05:31:43


I am trying to find a sentence that contains an apostrophe. In the text

If you ask foreigners to name some typically English dishes, they will probably say fish and chips and then stop. It is disappointing, but true, that there is no tradition in Britain of eating in restaurants, because our food doesn't lend itself to such preparation. British cooking is found in the home, where it is possible to time the dishes to perfection. So it is difficult to find a good English restaurant with reasonable prices

I am trying to find

find it is disappointing, but true, that there is no tradition in britain of eating in restaurants, because our food doesn't

I created this query:

{
"_index": "liza_index",
"_type": ".percolator",
"_id": "1594",
"_version": 37,
"found": true,
"_source": {
    "query": {
        "bool": {
            "minimum_should_match": 1,
            "should": {
                "span_or": {
                    "clauses": [{
                        "span_near": {
                            "in_order": true,
                            "clauses": [{
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "it"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "is"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "disappointing"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "but"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "true"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "that"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "there"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "is"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "no"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "tradition"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "in"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "britain"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "of"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "eating"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "in"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "restaurants"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "because"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "our"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "food"
                                        }
                                    }
                                }
                            }, {
                                "span_multi": {
                                    "match": {
                                        "regexp": {
                                            "message": "doesn't"
                                        }
                                    }
                                }
                            }],
                            "slop": 0,
                            "collect_payloads": false
                        }
                    }]
                }
            }
        }
    }
}}

But Elasticsearch does not find it. Without "doesn't", the query works.

I tried putting a backslash before the apostrophe: "doesn\'t" is rejected as invalid, so I tried "doesn\\'t" and "doesn\\\'t". Neither works.

I also created a query containing only the word "doesn't", both with and without the backslash:

{
"_index": "liza_index",
"_type": ".percolator",
"_id": "2101",
"_version": 31,
"found": true,
"_source": {
    "query": {
        "bool": {
            "minimum_should_match": 1,
            "should": {
                "span_or": {
                    "clauses": [{
                        "span_multi": {
                            "match": {
                                "regexp": {
                                    "message": "doesn't"
                                }
                            }
                        }
                    }]
                }
            }
        }
    }
}}

That does not work either. Meanwhile, the following queries do work:

curl -XPUT 'localhost:9200/liza_index/.percolator/1' -d '{"query" : {"match" : {"message" : "doesn't"}}}'

and

curl -XPUT 'localhost:9200/liza_index/.percolator/1' -d '{"query" : {"match" : {"message" : "doesn\\'t"}}}'

The question: how can I find words that contain an apostrophe? What query should I write, keeping the structure of the first query?


1 answer
User
#1 · Posted 2024-09-30 05:31:43

“The hardest thing of all is to find a black cat in a dark room, especially if there is no cat.”

― Confucius

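Elasticsearch runs standard analysis and normalization on the data it receives, which removes most punctuation. If you use a match query, your query text goes through the same analysis, so it works (all punctuation in the query is removed as well). A regexp query is not analyzed: its pattern is compared against the indexed terms exactly as written. That is why it does not find the word with the apostrophe.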
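The difference between an analyzed and a non-analyzed query can be illustrated with a toy model. This is only a sketch and not Elasticsearch code: `toy_analyze` is a hypothetical stand-in that lowercases text and splits on anything that is not a letter or digit, following the description above, and Python's `re` stands in for Lucene's regexp syntax. The real analyzer may treat apostrophes differently depending on the version and the field mapping, so it is worth inspecting the actual tokens with the `_analyze` API.

```python
import re


def toy_analyze(text):
    """Hypothetical analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_index(text):
    """Store the set of analyzed terms, as an inverted index would."""
    return set(toy_analyze(text))


def match_query(index, query_text):
    """Analyzed query: the input goes through the same analyzer first."""
    return any(tok in index for tok in toy_analyze(query_text))


def regexp_query(index, pattern):
    """Non-analyzed query: the pattern must match a whole term as written."""
    compiled = re.compile(pattern)
    return any(compiled.fullmatch(term) for term in index)


index = build_index("our food doesn't lend itself to such preparation")

print(sorted(index))                   # terms as stored by the toy analyzer
print(match_query(index, "doesn't"))   # query text is analyzed the same way
print(regexp_query(index, "doesn't"))  # no stored term contains an apostrophe
print(regexp_query(index, "doesn"))    # matches the stored fragment
```

In this toy model the regexp only succeeds when the pattern is written in terms of what the analyzer actually stored, which is the point the answer makes about non-analyzed queries.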

You can use match_phrase:

curl -XPOST "http://esarchive.local:9200/liza_index/.percolator/_search" -d'
{
  "query": {
    "match_phrase": {
      "message": "find it is disappointing, but true, that there is no tradition in britain of eating in restaurants, because our food doesn\"t"
    }
  }
}'
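Much of the trouble with the apostrophe in the commands above comes from shell quoting rather than from Elasticsearch. As a minimal sketch, the request body can be built in Python and serialized with `json.dumps`, so the apostrophe needs no manual escaping. The helper name `match_phrase_body` is hypothetical, and sending the body to the cluster with an HTTP client is deliberately left out.

```python
import json


def match_phrase_body(field, phrase):
    """Build a match_phrase request body as a JSON string (hypothetical helper)."""
    return json.dumps({"query": {"match_phrase": {field: phrase}}})


body = match_phrase_body("message", "because our food doesn't")
print(body)
```

An apostrophe is an ordinary character inside a JSON string, so the serialized body contains it unchanged and without any backslashes.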

or create a custom analyzer/mapping that preserves punctuation.
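One possible shape for such an analyzer, given only as a sketch: a custom analyzer that combines the built-in `whitespace` tokenizer with the `lowercase` token filter, so tokens are split on whitespace only and punctuation stays attached. The analyzer name `keep_punctuation` and the helper function are assumptions made for illustration. How the `message` field is mapped to the analyzer depends on the Elasticsearch version and is not shown.

```python
import json


def punctuation_preserving_settings(analyzer_name="keep_punctuation"):
    """Index settings with a custom analyzer (the analyzer name is hypothetical)."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        # Split on whitespace only, so "doesn't" stays one token.
                        "tokenizer": "whitespace",
                        # Keep matching case-insensitive.
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


settings = punctuation_preserving_settings()
print(json.dumps(settings, indent=2))
```

A trade-off of this choice is that all punctuation is kept, so a word followed by a comma is indexed as "true," rather than "true", and term-level patterns have to account for that.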
