Searching id:xiaoming666888 with the ik analyzer returns nothing, but searching id:xiaoming and id:666888 works #258

kubbo · 2016-08-09T05:39:52Z

The id field is analyzed with ik. With the ik_max_word analyzer, xiaoming666888 is tokenized as follows:
$ curl "http://localhost/_analyze?analyzer=ik_max_word&pretty&text=xiaoming666888"

{
  "tokens" : [ {
    "token" : "xiaoming666888",
    "start_offset" : 0,
    "end_offset" : 14,
    "type" : "LETTER",
    "position" : 0
  }, {
    "token" : "xiaoming",
    "start_offset" : 0,
    "end_offset" : 8,
    "type" : "ENGLISH",
    "position" : 1
  }, {
    "token" : "666888",
    "start_offset" : 8,
    "end_offset" : 14,
    "type" : "ARABIC",
    "position" : 2
  } ]
}

For this query request:

POST /test/_search
{
  "filter" : {
    "and" : [
      {
        "term" : { "id" : "xiaoming" }
      },
      {
        "term" : { "id" : "666888" }
      }
    ]
  }
}

The query above does return xiaoming666888, but the one below does not:

GET /test/_search?q=id:'xiaoming666888'&default_operator=AND&analyzer=ik_max_word

Shouldn't ik_max_word split the id into xiaoming and 666888 and then filter on both with AND? Is there any difference between these two queries in ES?
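To make the expectation in this question concrete, here is a minimal Python sketch of a toy inverted index built from the three tokens in the `_analyze` output above. It only models two things: a `term` filter does an exact, unanalyzed lookup of the given token, and AND semantics intersect the matching document sets. The document id `1` and the helper names `term_lookup` and `and_all` are hypothetical. The sketch does not model how the `q=` query-string parser handles the single quotes or the `analyzer` parameter, so it shows what the second query would return if it were analyzed into the same tokens, not what Elasticsearch actually does.

```python
# Tokens that ik_max_word emitted for "xiaoming666888"
# (copied from the _analyze output in this issue).
DOC_TOKENS = ["xiaoming666888", "xiaoming", "666888"]

# Toy inverted index: token -> set of document ids.
# Document id 1 is a hypothetical id for the indexed record.
index = {}
for token in DOC_TOKENS:
    index.setdefault(token, set()).add(1)


def term_lookup(token):
    """Exact lookup with no analysis, which is how a term filter behaves."""
    return index.get(token, set())


def and_all(tokens):
    """Intersect the posting sets of all tokens (AND semantics)."""
    sets = [term_lookup(t) for t in tokens]
    return set.intersection(*sets) if sets else set()


# First query in the issue: two term filters combined with "and".
filter_hits = and_all(["xiaoming", "666888"])

# Expected behaviour of the second query, assuming the query text is
# analyzed into the same three tokens and combined with AND.
expected_query_hits = and_all(DOC_TOKENS)

print(filter_hits, expected_query_hits)
```

Under these assumptions both lookups find the document, so a miss on the second query would point at how the query text is parsed or analyzed rather than at the indexed tokens.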


nathan-zhu · 2016-08-15T02:51:04Z

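I have run into this too. Digits and pinyin or Chinese characters seem to get split apart, and adding specific dictionary terms did not help either; I am still looking into how to solve it.
Alternatively, try searching for xiaoming666888 as a phrase and see what happens.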
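As a rough illustration of the phrase suggestion, the Python sketch below models phrase matching purely by token positions, using the `(token, position)` pairs from the `_analyze` output earlier in this issue. The function `phrase_match` is a hypothetical helper, and the model is a simplification: it assumes the query text is analyzed with the same analyzer as the field and ignores slop and other Elasticsearch details.

```python
# (token, position) pairs from the _analyze output in this issue.
DOC = [("xiaoming666888", 0), ("xiaoming", 1), ("666888", 2)]


def phrase_match(doc, query):
    """Return True if every query token occurs in doc at the same
    relative position as in the query (a simplified phrase match)."""
    if not query:
        return False
    # Map each token to the set of positions where it occurs in the doc.
    positions = {}
    for token, pos in doc:
        positions.setdefault(token, set()).add(pos)
    first_token, first_pos = query[0]
    # Try every occurrence of the first query token as the phrase start.
    for start in positions.get(first_token, set()):
        if all(
            start + (pos - first_pos) in positions.get(token, set())
            for token, pos in query
        ):
            return True
    return False


# Query analyzed with the same analyzer as the indexed field (assumption).
same_order = phrase_match(DOC, DOC)
# Tokens present in the document but in the wrong relative order.
wrong_order = phrase_match(DOC, [("666888", 0), ("xiaoming", 1)])
print(same_order, wrong_order)
```

In this simplified model, a phrase built from the same analysis output matches, while the same tokens in a different order do not.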

ScsUndefined · 2016-09-23T12:36:55Z

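@nathan-zhu
English letters, Arabic numerals, and Chinese characters all belong to different scripts. Some analyzers, such as icu and standard, split text based on Unicode, so could this be related to the script?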

medcl · 2017-01-05T02:57:41Z

I could not reproduce this locally.
