The id field is analyzed with ik. For xiaoming666888, the ik_max_word analyzer produces the following tokens:
$ curl "http://localhost/_analyze?analyzer=ik_max_word&pretty&text=xiaoming666888"
{
  "tokens" : [
    { "token" : "xiaoming666888", "start_offset" : 0, "end_offset" : 14, "type" : "LETTER", "position" : 0 },
    { "token" : "xiaoming", "start_offset" : 0, "end_offset" : 8, "type" : "ENGLISH", "position" : 1 },
    { "token" : "666888", "start_offset" : 8, "end_offset" : 14, "type" : "ARABIC", "position" : 2 }
  ]
}
For this query request:
POST /test/_search { "filter" : { "and" : [ { "term" : { "id" : "xiaoming" } }, { "term":{"id":"666888"} } ] }}
The query above does return xiaoming666888, but the following query does not:
GET /test/_search?q=id:'xiaoming666888'&default_operator=AND&analyzer=ik_max_word
Shouldn't ik_max_word split the id into xiaoming and 666888 and then filter on both with AND? Is there any difference between these two queries in ES?
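As a rough illustration of why the first request matches, the sketch below models the indexed terms of the id field as a plain Python set built from the three tokens in the _analyze output above. A term filter is not analyzed, so each one is an exact lookup against the indexed tokens, and the and filter requires every lookup to succeed. The names `indexed_tokens` and `term_and_filter` are hypothetical and exist only for this example; this is a simplified model, not how Elasticsearch is implemented internally.

```python
# Tokens that ik_max_word produced for "xiaoming666888"
# (copied from the _analyze output in this issue).
indexed_tokens = {"xiaoming666888", "xiaoming", "666888"}


def term_and_filter(terms, tokens):
    """Return True when every term appears verbatim among the indexed tokens.

    Models an `and` of `term` filters: the terms are NOT analyzed,
    so each one must equal an indexed token exactly.
    """
    return all(term in tokens for term in terms)


# Both terms are indexed tokens, so the document passes the filter.
print(term_and_filter(["xiaoming", "666888"], indexed_tokens))

# "666" is only a substring of an indexed token, so the filter fails.
print(term_and_filter(["xiaoming", "666"], indexed_tokens))
```

Whether the q= request behaves the same way depends on which tokens the query string is actually analyzed into, which this sketch does not model.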
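I have run into this as well. Digits seem to get split apart from pinyin or Chinese characters, and adding a specific tokenization rule did not help either. I am still looking into how to solve it. Alternatively, try searching for xiaoming666888 as a phrase query and see what happens.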
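@nathan-zhu English letters, Arabic digits, and Chinese characters all belong to different scripts. Some analyzers, such as icu and standard, split text based on Unicode, so could this be related to the script?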
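To illustrate the idea of splitting at script boundaries, here is a minimal sketch that breaks a string wherever the coarse character class changes. It is a deliberately simplified model with hypothetical helper names (`char_class`, `split_by_class`); it is not the algorithm that ik, icu, or standard actually use, and their real output may differ.

```python
import unicodedata


def char_class(ch):
    """Classify a character coarsely: 'digit', 'latin', 'cjk', or 'other'."""
    if ch.isdigit():
        return "digit"
    # Unicode character names start with the script for these ranges.
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "latin"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "cjk"
    return "other"


def split_by_class(text):
    """Split text wherever the coarse character class changes."""
    pieces = []
    for ch in text:
        # Extend the current piece if this character has the same class
        # as the last character of that piece; otherwise start a new piece.
        if pieces and char_class(pieces[-1][-1]) == char_class(ch):
            pieces[-1] += ch
        else:
            pieces.append(ch)
    return pieces


print(split_by_class("xiaoming666888"))  # ['xiaoming', '666888']
```

Under this model the id splits into the same two sub-tokens that appear in the _analyze output, which is consistent with the script-boundary explanation but does not confirm it.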
I could not reproduce this locally.