[RFC] Optimizing Text Embedding Processor #1138
Problem Statement
Proposal: #793
Text Embedding Processor is a processor that is defined as part of an ingest pipeline to create vector embeddings from text. In its current state, the processor makes a model inference call on every document ingestion or update. While this is necessary for generating embeddings during initial document ingestion, it is unnecessary to regenerate them during a document update if the embedding-related fields are unchanged. This inefficiency leads to avoidable cost increases for customers and computational overhead for model inference. This document discusses the design to achieve this optimization of the Text Embedding Processor in Neural Search.
Requirement
Out of Scope
Current State
Text Embedding Processor Configuration
Currently, the Text Embedding Processor expects two fields:
model_id: a model to be used for creating vector embeddings
field_map: specifies the name of the field from which to take the text (text) and the name of the field in which to record embeddings (passage_embedding). Reference: link
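For reference, a typical pipeline definition with today's two fields looks like the following (the pipeline name is illustrative and the model ID is a placeholder):

```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "description": "Pipeline that generates vector embeddings from text",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model_id>",
        "field_map": {
          "text": "passage_embedding"
        }
      }
    }
  ]
}
```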
Current Flows
In the current flow, there is no difference between ingestion and update of a document: embeddings are created by the Text Embedding Processor every time a document is ingested or updated. See below for the different use case scenarios:
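For example, indexing a document through the pipeline today always generates embeddings; re-issuing the exact same request with identical text triggers a fresh inference call, since the processor never compares against the stored document (index and pipeline names are illustrative):

```json
PUT /my-index/_doc/1?pipeline=nlp-ingest-pipeline
{
  "text": "A passage to embed"
}
```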
Current Scenario 1: Single Document Update with embedding field
Steps:
1. User sends an ingest or update request for a document containing the embedding source field.
2. Text Embedding Processor makes an inference call through MLCommonsClientAccessor to generate embeddings.
3. The document is written with the newly generated embeddings.
Current Scenario 2: Single Document Update without embedding field
Steps:
1. User sends an update request for a document that does not contain the embedding source field.
2. Text Embedding Processor finds no mapped text field and skips the inference call.
Current Scenario 3: Single Document Update with both embedding and vector embedded fields
Steps:
1. User sends an update request containing both the embedding source field and the vector embedded field.
2. Text Embedding Processor makes an inference call through MLCommonsClientAccessor and overwrites the supplied vector embedded field with the newly generated embeddings.
Current Scenario 4: Single Document Update with only vector embedded field
Steps:
1. User sends an update request containing only the vector embedded field.
2. Text Embedding Processor finds no mapped text field and skips the inference call; the supplied embeddings are ingested as-is.
Proposed State
Proposed Text Embedding Processor Configuration
An optional flag ignore_unaltered will be supported as an input to text_embedding. If the flag is defined and set to true, the Text Embedding Processor will attempt to skip inference for eligible unaltered text. If the flag is not defined or is set to false, the Text Embedding Processor will behave as it does today, always making a call to the model without checking the document state.
An alternative is to define the flag at the cluster level, which would enable or disable the optimization cluster-wide. Since this flag pertains specifically to the Text Embedding Processor, the proposed configuration keeps it as a parameter of the text embedding processor only. If, in the future, other processors implement similar optimizations, a general optimization flag could be introduced at the cluster level.
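A sketch of the proposed configuration, with ignore_unaltered set as a processor parameter (this is the proposed flag, not an existing one; the model ID is a placeholder):

```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model_id>",
        "field_map": {
          "text": "passage_embedding"
        },
        "ignore_unaltered": true
      }
    }
  ]
}
```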
Proposed Flows
Initial Document Ingestion Flow
The document ingestion flow will not change. [Refer to Appendix 1.1 for the prior proposal.]
Update Document Flow
The updated flow will involve the existing OpenSearchClient, defined through the client in Neural Search. The client serves to interact with APIs offered in OpenSearch Core; for this use case, it will be used to fetch already ingested documents.
Proposed Scenario 1: Single Document Update with single embedding field
Steps:
1. User sends an update request for a document containing the embedding source field.
2. Text Embedding Processor fetches the existing document through OpenSearchClient.
3. The incoming text field is compared against the existing document.
4. If the text is unchanged, the existing embedding is copied over and the inference call is skipped; otherwise an inference call to ML Commons is made.
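To make the comparison step concrete, below is a minimal sketch of the per-field check in Java. All names here are hypothetical rather than the actual Neural Search implementation; the existing source is assumed to have already been fetched through OpenSearchClient.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SkipUnalteredCheck {

    // Copies embeddings for unchanged text fields from the existing document
    // into the updated document, and returns the text fields that still need
    // a model inference call.
    public static List<String> reuseOrCollect(
            Map<String, Object> existingSource, // fetched via OpenSearchClient
            Map<String, Object> updatedSource,  // the incoming update
            Map<String, String> fieldMap) {     // e.g. {"text": "passage_embedding"}

        List<String> needsInference = new ArrayList<>();
        for (Map.Entry<String, String> entry : fieldMap.entrySet()) {
            String textField = entry.getKey();
            String embeddingField = entry.getValue();
            Object oldText = existingSource.get(textField);
            Object newText = updatedSource.get(textField);

            if (newText != null && newText.equals(oldText)
                    && existingSource.containsKey(embeddingField)) {
                // Text unchanged: reuse the stored embedding and skip inference.
                updatedSource.put(embeddingField, existingSource.get(embeddingField));
            } else if (newText != null) {
                // New or changed text: this field goes to ML Commons.
                needsInference.add(textField);
            }
        }
        return needsInference;
    }
}
```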
Proposed Scenario 2: Single Document Update with multiple embedding fields
Steps:
1. User sends an update request for a document containing multiple embedding source fields.
2. Text Embedding Processor fetches the existing document through OpenSearchClient.
3. Each mapped text field is compared against the existing document.
4. Unchanged fields reuse their existing embeddings; changed or new fields are sent to ML Commons for inference.
Proposed Scenario 3: Single Document Update without embedding field
No change to existing behavior. Model inference will be skipped regardless of the feature, because the text field is missing. See Current Flows - Current Scenario 2 for the expected flow.
Proposed Scenario 4: Single Document Update with vector embedded field
Steps:
1. User sends an update request containing the vector embedded field alongside the embedding source field.
2. Text Embedding Processor fetches the existing document through OpenSearchClient.
3. If the text field is unchanged, the inference call is skipped; otherwise an inference call to ML Commons is made.
Proposed Scenario 5: Single Document Update with only vector embedded field
No change to existing behavior. Model inference will be skipped regardless of the feature, because the text field is missing.
See Current Flows - Current Scenario 4 for the expected flow.
Proposed Scenario 6: Batch Document Update
Steps:
1. User sends a bulk request updating multiple documents.
2. Text Embedding Processor fetches the existing documents through OpenSearchClient.
3. Each document's mapped text fields are compared against its existing version.
4. a. If a document passes the checks in step 3, its existing embeddings are copied and inference is skipped.
b. If a document does not pass the checks in step 3, an inference call to ML Commons is made.
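For illustration, here is a bulk re-index of two documents through the pipeline (index name and texts are illustrative). Under the proposal, a document whose text matches its stored version reuses its embeddings, while a changed document still triggers an inference call:

```json
POST /_bulk?pipeline=nlp-ingest-pipeline
{ "index": { "_index": "my-index", "_id": "1" } }
{ "text": "An unchanged passage" }
{ "index": { "_index": "my-index", "_id": "2" } }
{ "text": "A newly edited passage" }
```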
Summary
The outcome of the proposed change can be summarized as follows: when ignore_unaltered is enabled, a document update whose embedding source fields are unchanged reuses the stored embeddings and skips model inference, reducing inference cost and computational overhead; initial ingestion and all other flows remain unchanged.
Questions Considered
Appendix
1.1 Alternative approach with modified Ingestion Flow
In order to check whether a model has been updated, the ingested document needs to store the model ID in a field, associating the model ID with the embeddings. The Text Embedding Processor will fetch the model ID when the user updates the document to ensure it has not changed before skipping the call to create embeddings.
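Under this alternative, an ingested document would carry the model ID alongside its embeddings, roughly as follows (the passage_embedding_model_id field name is hypothetical, and the vector is truncated for brevity):

```json
{
  "text": "A passage to embed",
  "passage_embedding": [0.012, -0.034, 0.221],
  "passage_embedding_model_id": "<model_id>"
}
```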
Request For Feedback
We would like to get some feedback on the name of the feature. ignore_unaltered is what we're proposing, but we would appreciate suggestions.