From 32d5cc9376a9f159b0d21c127da4b171db9fed13 Mon Sep 17 00:00:00 2001
From: Shad Storhaug
Date: Thu, 1 Feb 2024 15:56:47 +0700
Subject: [PATCH] Lucene.Net.Analysis.OpenNLP.overview.md: Corrected information about which filters are included in the package (there is no NER filter in the box)

---
 src/Lucene.Net.Analysis.OpenNLP/overview.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/Lucene.Net.Analysis.OpenNLP/overview.md b/src/Lucene.Net.Analysis.OpenNLP/overview.md
index 1ea9d10636..882ddff08c 100644
--- a/src/Lucene.Net.Analysis.OpenNLP/overview.md
+++ b/src/Lucene.Net.Analysis.OpenNLP/overview.md
@@ -31,12 +31,14 @@ The OpenNLP Tokenizer behavior is similar to the .
- segments text into sentences or words. This Tokenizer uses the OpenNLP Sentence Detector and/or Tokenizer classes. When used together, the Tokenizer receives sentences and can do a better job.
- tags words using one or more technologies: Part-of-Speech, Chunking, and Named Entity Recognition. These tags are assigned as token types. Note that only one of these operations will tag
+- segments text into sentences or words. This Tokenizer uses the OpenNLP Sentence Detector and/or Tokenizer classes. When used together, the Tokenizer receives sentences and can do a better job.
+- tags words for Part-of-Speech and tags words for Chunking. These tags are assigned as token types. Note that only one of these operations will tag
 Since the is not stored in the index, it is recommended that one of these filters is used following OpenNLPFilter to enable search against the assigned tags:
- copies the value to the
- creates a cloned token at the same position as each tagged token, and copies the value to the , optionally with a customized prefix (so that tags effectively occupy a different namespace from token text).
+- copies the value to the
+- creates a cloned token at the same position as each tagged token, and copies the value to the , optionally with a customized prefix (so that tags effectively occupy a different namespace from token text).
+
+Named Entity Recognition is also supported by OpenNLP, but there is no OpenNLPNERFilter included. For an implementation, see the [lucenenet-opennlp-mavenreference-demo](https://github.com/NightOwl888/lucenenet-opennlp-mavenreference-demo).
 ## MavenReference Primer