From a2fd652e6c4960c6f2a8308e872b5c6ae16d8ccd Mon Sep 17 00:00:00 2001
From: Dmitrii Ogn
Date: Sat, 7 Sep 2024 00:53:25 +0300
Subject: [PATCH] Update ARCHITECTURE.md

Typo fix
---
 ARCHITECTURE.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index ee0a3743a8..e7b9ed1496 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -262,8 +262,8 @@ The fieldnorm is therefore compressed. Values up to 40 are encoded unchanged.
 ## [tokenizer/](src/tokenizer): How should we process text?
 
 Text processing is key to a good search experience.
-Splits or normalize your text too much, and the search results will have a less precision and a higher recall.
-Do not normalize, or under split your text, you will end up with a higher precision and a lesser recall.
+Split or normalize your text too much, and the search results will have a less precision and a higher recall.
+Do not normalize, or undersplit your text, you will end up with a higher precision and a lesser recall.
 
 Text processing can be configured by selecting an off-the-shelf [`Tokenizer`](./src/tokenizer/tokenizer.rs) or implementing your own to first split the text into tokens, and then chain different [`TokenFilter`](src/tokenizer/tokenizer.rs)'s to it.
 