From f382ff9b3f78198349452a002b1eebde7b5903c4 Mon Sep 17 00:00:00 2001
From: Sean Story <sean.j.story@gmail.com>
Date: Tue, 30 Jan 2024 01:26:31 -0600
Subject: [PATCH] Suggest chunking for large ELSER fields (#2660)

(cherry picked from commit f4dacc9dd2b116377ceea3c2707ad1f97356f582)
---
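Note: the "chunking" suggested in this patch could look roughly like the
sketch below. It is a minimal illustration, not part of the docs change.
`chunk_text` and `max_words` are hypothetical names, and whitespace-separated
words are only a rough proxy for ELSER tokens, so the budget is an assumption
that should sit well under 512 and be checked against the real tokenizer.

```python
def chunk_text(text: str, max_words: int = 200) -> list[str]:
    """Split text into chunks of at most max_words whitespace-separated words.

    Assumption: word count stands in for token count. A word may map to
    more than one model token, so max_words is kept well below 512.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in fixed-size windows; the last chunk
    # may be shorter than max_words.
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

The resulting list could then be indexed as a multi-valued field, so that each
value stays within the limit instead of the field being truncated.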
 docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc b/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
index 80e37da4a..3596e58e4 100644
--- a/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
+++ b/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
@@ -408,9 +408,11 @@ image::images/ml-nlp-elser-v2-test.png[alt="Testing ELSER",align="center"]
 * ELSER works best on small-to-medium sized fields that contain natural 
 language. For connector or web crawler use cases, this aligns best with fields 
 like _title_, _description_, _summary_, or _abstract_. As ELSER encodes the 
-first 512 tokens of a field, it may not be as good a match for `body_content` on 
-web crawler documents, or body fields resulting from extracting text from office 
-documents with connectors.
+first 512 tokens of a field, it may not return results that are as relevant for
+large fields, such as `body_content` on web crawler documents, or body fields
+resulting from extracting text from office documents with connectors. For larger
+fields like these, consider "chunking" the content into multiple values, where
+each chunk is under 512 tokens.
 * Larger documents take longer at ingestion time, and {infer} time per 
 document also increases the more fields in a document that need to be processed.
 * The more fields your pipeline has to perform inference on, the longer it takes 
@@ -521,4 +523,4 @@ image::images/ml-nlp-elser-v2-opt-bm-results.png[alt="ELSER V2 optimized benchma
 respectively 14 docs/s and 16 docs/s, indicating a performance improvement due 
 to virtual cores of 12%.
 
-image::images/ml-nlp-elser-v2-cp-bm-results.png[alt="ELSER V2 cross-platform benchmarks",align="center"]
\ No newline at end of file
+image::images/ml-nlp-elser-v2-cp-bm-results.png[alt="ELSER V2 cross-platform benchmarks",align="center"]