[ML] Chunking for the reranker #121567

Open
dan-rubinstein opened this issue Feb 3, 2025 · 1 comment
Assignees
Labels
>enhancement :ml Machine learning Team:ML Meta label for the ML team

Comments

@dan-rubinstein
Member

dan-rubinstein commented Feb 3, 2025

Description

While text_embedding and sparse_embedding are the first candidates for chunking at the inference call, we should also consider rerank. Today the Elasticsearch rerank strategy is to truncate, and the text similarity retriever does not apply chunking either. With chunking implemented in the rerank API, we could also extract the best fragments of the text at no additional cost, in a way that is adaptable to any rerank provider.
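To make the "best fragments" idea concrete, here is a minimal sketch, not a proposed implementation. It assumes a document's score is the maximum score over its chunks, which is only one possible aggregation. The names `split_words`, `rerank_with_chunks`, and `overlap_score` are hypothetical, chunks are cut on whitespace words rather than model tokens, and `score_fn` stands in for a call to whichever rerank provider is used.

```python
from typing import Callable, List, Tuple


def split_words(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size whitespace words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def rerank_with_chunks(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float],  # hypothetical stand-in for a rerank provider
    chunk_size: int = 50,
) -> List[Tuple[int, float, str]]:
    """Return (doc_index, best_score, best_chunk) tuples, highest score first."""
    results = []
    for doc_index, doc in enumerate(documents):
        chunks = split_words(doc, chunk_size) or [""]  # keep empty documents scoreable
        scored = [(score_fn(query, chunk), chunk) for chunk in chunks]
        # Assumption: the document score is the max over its chunks,
        # and the best-scoring chunk doubles as the extracted fragment.
        best_score, best_chunk = max(scored, key=lambda pair: pair[0])
        results.append((doc_index, best_score, best_chunk))
    results.sort(key=lambda r: r[1], reverse=True)
    return results


def overlap_score(query: str, passage: str) -> float:
    """Toy scorer for illustration only: counts passage words that appear in the query."""
    query_words = set(query.lower().split())
    return float(sum(1 for w in passage.lower().split() if w in query_words))
```

Because each chunk is scored anyway, the best fragment is a by-product of reranking rather than a separate pass, which is where the "no additional cost" comes from.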

Rerank is a good place to start, but it is not straightforward for the Elastic reranker, which is a cross-encoder model that processes the query and the document together. The combined token count must be < 512 and the query length is variable, which makes chunking the document more difficult: either chunking is done dynamically once the query is known, or we pick a low number, say 256 tokens, and truncate the query at 256 tokens.
Cohere truncates documents after 4096 tokens, which is large enough for most reasonable chunk sizes.
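The two options for the Elastic reranker can be sketched as follows. This is a rough illustration under stated assumptions: whitespace words stand in for model tokens, special tokens are ignored, the 512 limit is treated as a plain budget passed in by the caller, and the function names are hypothetical.

```python
from typing import List, Tuple

MAX_COMBINED_TOKENS = 512  # combined query + document limit mentioned above
FIXED_SPLIT = 256          # the "low number" option from the issue


def tokenize(text: str) -> List[str]:
    """Assumption: whitespace words approximate model tokens."""
    return text.split()


def chunk(tokens: List[str], size: int) -> List[List[str]]:
    """Cut tokens into consecutive chunks of at most `size` tokens."""
    if size <= 0:
        raise ValueError("no token budget left for the document")
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def dynamic_chunks(
    query: str, document: str, max_combined: int = MAX_COMBINED_TOKENS
) -> Tuple[List[str], List[List[str]]]:
    """Option 1: size document chunks only once the query length is known."""
    query_tokens = tokenize(query)
    budget = max_combined - len(query_tokens)  # whatever the query leaves over
    return query_tokens, chunk(tokenize(document), budget)


def fixed_chunks(
    query: str, document: str, split: int = FIXED_SPLIT
) -> Tuple[List[str], List[List[str]]]:
    """Option 2: fixed chunk size, with the query truncated to the same size."""
    query_tokens = tokenize(query)[:split]
    return query_tokens, chunk(tokenize(document), split)
```

The trade-off is visible in the signatures: `dynamic_chunks` cannot run until the query arrives, so chunks cannot be prepared ahead of time, while `fixed_chunks` allows that at the price of truncating long queries and leaving budget unused for short ones.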

The purpose of this issue is to investigate how we can add chunking to rerank, design the solution, and implement it.

@dan-rubinstein dan-rubinstein added >enhancement needs:triage Requires assignment of a team area label :ml Machine learning Team:ML Meta label for the ML team labels Feb 3, 2025
@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Feb 3, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)


2 participants