[BUG] Neural query is failing when used with explain by doc_id #1126

Open
martin-gaievski opened this issue Jan 21, 2025 · 1 comment
Labels: bug (Something isn't working)

@martin-gaievski
Member

What is the bug?

A neural query throws an exception when I use it with explain in "by doc_id" mode (the `_explain/<doc_id>` endpoint). The response looks like this, and there are no exceptions in the server log:

{
    "error": {
        "root_cause": [
            {
                "type": "query_shard_exception",
                "reason": "failed to create query: async actions are left after rewrite",
                "index": "my-nlp-index-1",
                "index_uuid": "I6_pzGG3QBWw0B6KeOJ1pA"
            }
        ],
        "type": "query_shard_exception",
        "reason": "failed to create query: async actions are left after rewrite",
        "index": "my-nlp-index-1",
        "index_uuid": "I6_pzGG3QBWw0B6KeOJ1pA",
        "caused_by": {
            "type": "illegal_state_exception",
            "reason": "async actions are left after rewrite"
        }
    },
    "status": 400
}
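
When scripting the reproduction, it can help to recognize this failure programmatically. Below is a minimal Python sketch, assuming the error body keeps the shape shown above. The helper name `is_async_rewrite_error` is hypothetical and not part of OpenSearch or the neural-search plugin.

```python
ASYNC_REWRITE_REASON = "async actions are left after rewrite"


def is_async_rewrite_error(body: dict) -> bool:
    """Return True if a parsed response body matches the error from this issue.

    Hypothetical helper: it only inspects the fields visible in the
    response above (error.type, error.caused_by.type, error.caused_by.reason).
    """
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    caused_by = error.get("caused_by") or {}
    return (
        error.get("type") == "query_shard_exception"
        and caused_by.get("type") == "illegal_state_exception"
        and ASYNC_REWRITE_REASON in caused_by.get("reason", "")
    )


# Trimmed copy of the response reported above.
sample = {
    "error": {
        "type": "query_shard_exception",
        "reason": "failed to create query: async actions are left after rewrite",
        "index": "my-nlp-index-1",
        "caused_by": {
            "type": "illegal_state_exception",
            "reason": "async actions are left after rewrite",
        },
    },
    "status": 400,
}

print(is_async_rewrite_error(sample))
```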

How can one reproduce the bug?

I tested this on the latest main.

Follow these steps, replacing the IDs in my examples with your own as needed.

  1. Spin up a new test cluster locally
  2. Update cluster-level settings to allow ML to run locally
PUT {{base_url}}/_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.native_memory_threshold": 100,
        "plugins.ml_commons.only_run_on_ml_node": false,
        "plugins.ml_commons.allow_registering_model_via_url": true,
        "plugins.ml_commons.model_access_control_enabled": true
    }
}
  3. Create a model group
POST {{base_url}}/_plugins/_ml/model_groups/_register
{
    "name": "test_model_group_public",
    "description": "This is a public model group"
}
  4. Register a model; I used one from the list of OpenSearch pre-trained models
POST {{base_url}}/_plugins/_ml/models/_register?deploy=true
{
    "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
    "version": "1.0.1",
    "model_group_id": "2dlgipQBMpRflb8wsfvS",
    "model_format": "TORCH_SCRIPT"
}
  5. Deploy the model
POST {{base_url}}/_plugins/_ml/models/29lgipQBMpRflb8w2vvM/_deploy
  6. Create an ingest pipeline with your new model
PUT {{base_url}}/_ingest/pipeline/my-ingest-pipeline
{
    "description": "An NLP ingest pipeline",
    "processors": [
        {
            "text_embedding": {
                "model_id": "29lgipQBMpRflb8w2vvM",
                "field_map": {
                    "name": "passage_embedding"
                }
            }
        }
    ]
}
  7. Create a k-NN enabled index
PUT {{base_url}}/my-nlp-index-1
{
    "settings": {
        "index.knn": true,
        "number_of_shards": 12,
        "number_of_replicas": 0,
        "default_pipeline": "my-ingest-pipeline"
    },
    "mappings": {
        "properties": {
            "passage_embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "method": {
                    "name": "hnsw",
                    "engine": "lucene",
                    "parameters": {
                    }
                }
            }
        }
    }
}
  8. Ingest some data with the bulk API
POST {{base_url}}/my-nlp-index-1/_bulk?refresh
{"index":{}}
{"field1": 2,"vector": [0.4, 0.5, 0.2],"title": "basic", "name": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .", "category": "novel", "price": 20}
{"index":{}}
{ "name": "I brought home the trophy", "category": "story", "price": 20, "field1": 10,"vector": [0.2, 0.2, 0.3],"title": "java"}
{"index":{}}
{"field1": 50,"vector": [4.2, 5.5, 8.9],"name": "Why would he go to all that effort for a free pack of ranch dressing?", "category": "story", "price": 10 }
{"index":{}}
{"vector": [0.3, 0.12, 3.3],"title": "python","name": "In the next 40-50 years I plan on opening up my own business.","category": "poem","price": 100}
{"index":{}}
{  "field1": 100,"vector": [0.2, 0.2, 0.3],"title": "java", "name": "Does he have a big family?", "category": "biography", "price": 70}
{"index":{}}
{"name": "She is my younger sister","category": "workbook","price": 25}
  9. Get all docs and note the ID of one of them
GET {{base_url}}/my-nlp-index-1/_search
{
    "from": 0,
    "query" : {
        "match_all" : {}
    },
    "size": 2
}
  10. Run the explain call with a neural query
GET {{base_url}}/my-nlp-index-1/_explain/mvd6ipQBCFBddLuA87Oh
{
    "query": {
        "neural": {
            "passage_embedding": {
                "query_text": "sports team",
                "model_id": "29lgipQBMpRflb8w2vvM",
                "k": 100
            }
        }
    }
}

For me, the above request returns the error response shown at the top of this issue.
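
The last two steps can be scripted. Here is a minimal Python sketch using only the standard library, under a few assumptions: the cluster is reachable at `http://localhost:9200` (a stand-in for `{{base_url}}`), the search response has the usual `hits.hits[]._id` layout, and the helper names `first_doc_id` and `build_explain_request` are hypothetical. It sends the explain call as `POST`, which the explain API accepts alongside `GET`. The sketch only builds the request; actually sending it is left commented out.

```python
import json
import urllib.request
from urllib.parse import quote


def first_doc_id(search_response: dict) -> str:
    """Return the _id of the first hit in a parsed _search response."""
    hits = search_response["hits"]["hits"]
    if not hits:
        raise ValueError("search response contains no hits")
    return hits[0]["_id"]


def build_explain_request(base_url, index, doc_id, model_id, query_text,
                          field="passage_embedding", k=100):
    """Build (but do not send) the explain-by-doc-ID request from the last step."""
    body = {
        "query": {
            "neural": {
                field: {
                    "query_text": query_text,
                    "model_id": model_id,
                    "k": k,
                }
            }
        }
    }
    url = "{}/{}/_explain/{}".format(
        base_url.rstrip("/"), quote(index, safe=""), quote(doc_id, safe="")
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# IDs below are the ones from this issue; replace them with your own.
req = build_explain_request(
    "http://localhost:9200",  # assumed local test cluster
    "my-nlp-index-1",
    "mvd6ipQBCFBddLuA87Oh",
    "29lgipQBMpRflb8w2vvM",
    "sports team",
)
print(req.get_method(), req.full_url)

# To send it against a running cluster (urlopen raises HTTPError on a 400):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```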

The same query runs normally through the search API, both without the explain flag and with it passed as a URL parameter:

GET {{base_url}}/my-nlp-index-1/_search?explain=true or GET {{base_url}}/my-nlp-index-1/_search
{
    "size": 2,
    "_source": {
        "exclude": [
            "passage_embedding"
        ]
    },
    "query": {
        "neural": {
            "passage_embedding": {
                "query_text": "sports team",
                "model_id": "29lgipQBMpRflb8w2vvM",
                "k": 100
            }
        }
    }
}
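
Since `_search?explain=true` works, one possible interim way to get an explanation for a single document is to restrict the search to that document with an `ids` filter. The Python sketch below only builds such a request body; it is an untested assumption, not a confirmed workaround, and the helper name `build_search_explain_body` is hypothetical. Note that the document is only returned if it falls within the `k` nearest neighbors of the neural query, so this is not a full substitute for `_explain/<doc_id>`.

```python
import json


def build_search_explain_body(doc_id, model_id, query_text,
                              field="passage_embedding", k=100):
    """Build a _search body that scores with the neural query and keeps one doc.

    Intended to be sent to <index>/_search?explain=true (hypothetical usage).
    """
    return {
        "size": 1,
        "_source": {"excludes": [field]},
        "query": {
            "bool": {
                # Scoring clause: the same neural query as in this issue.
                "must": [
                    {
                        "neural": {
                            field: {
                                "query_text": query_text,
                                "model_id": model_id,
                                "k": k,
                            }
                        }
                    }
                ],
                # Non-scoring clause: keep only the document of interest.
                "filter": [{"ids": {"values": [doc_id]}}],
            }
        },
    }


body = build_search_explain_body(
    "mvd6ipQBCFBddLuA87Oh", "29lgipQBMpRflb8w2vvM", "sports team"
)
print(json.dumps(body, indent=4))
```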
@fen-qin

fen-qin commented Jan 22, 2025

taking a look.

@heemin32 heemin32 moved this from Backlog to Backlog(Hot) in Neural Search RoadMap Jan 23, 2025