Add a highlighter #205

valencik · 2024-03-30T19:45:09Z

It's important to show users their query in the context of the resulting documents.
Consider the example below, where the terms cats, effect, and effects are bolded in the search results display:

The design space for a highlighter is reasonably large. Lucene has several implementations.
I'm hoping we can get something basic without too much trouble.

valencik · 2024-03-30T20:32:47Z

Collecting some rough thoughts here for a first attempt.

for each doc in docs
  for each fragment in doc
    score query against fragment
    update max scoring fragment for doc
  format max scoring fragment
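
To make the loop above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names are made up, fragments are naive sentences, scoring is a plain count of matching tokens, and there is no stemming, so a query has to list both effect and effects to bold both. It is not how Lucene or this project does it.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a stand-in for a real analyzer."""
    return re.findall(r"\w+", text.lower())

def split_fragments(doc):
    """Naive sentence splitter; the fragment unit is an assumption."""
    return [s for s in re.split(r"(?<=[.!?])\s+", doc.strip()) if s]

def score(query_terms, fragment):
    """Count the fragment tokens that match a query term."""
    return sum(1 for tok in tokenize(fragment) if tok in query_terms)

def format_fragment(query_terms, fragment):
    """Wrap matching words in <b> tags, keeping the original casing."""
    def bold(match):
        word = match.group(0)
        return f"<b>{word}</b>" if word.lower() in query_terms else word
    return re.sub(r"\w+", bold, fragment)

def highlight(query, docs):
    """Return one formatted fragment per doc (empty string if no fragments)."""
    query_terms = set(tokenize(query))
    results = []
    for doc in docs:
        fragments = split_fragments(doc)
        if not fragments:
            results.append("")
            continue
        # max() keeps the first fragment on ties
        best = max(fragments, key=lambda f: score(query_terms, f))
        results.append(format_fragment(query_terms, best))
    return results
```

Calling `highlight("cats effect effects", [doc])` on a doc made of a few sentences returns the one sentence with the most matching words, with those words wrapped in `<b>` tags.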

What the heck is a fragment? Good question.
Ideally it's a small enough snippet of document content that you can comfortably render it on your search engine results page.
This could be "sentences", maybe it's "paragraphs", or perhaps "sections".
Clearly this would need to be configurable, as it depends a lot on your document structure.
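
One way to make the fragment unit configurable is to treat a fragmenter as just a function from text to a list of strings. The sketch below is hypothetical: the names `by_sentence`, `by_paragraph`, `by_section`, and `FRAGMENTERS` are invented here, and the section splitter assumes lines starting with `#` mark section boundaries, which will not hold for every document format.

```python
import re
from typing import Callable, Dict, List

Fragmenter = Callable[[str], List[str]]

def by_sentence(text: str) -> List[str]:
    """Split after ., !, or ? followed by whitespace (naive)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def by_paragraph(text: str) -> List[str]:
    """Split on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def by_section(text: str) -> List[str]:
    """Assumption: a line starting with '#' begins a new section."""
    sections: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.startswith("#") and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]

# Hypothetical registry so callers can pick a unit by name
FRAGMENTERS: Dict[str, Fragmenter] = {
    "sentence": by_sentence,
    "paragraph": by_paragraph,
    "section": by_section,
}

def fragments(text: str, unit: str = "sentence") -> List[str]:
    """Look up the fragmenter for `unit` and apply it."""
    return FRAGMENTERS[unit](text)
```

Since a fragmenter is only a function, users with unusual document structure could supply their own instead of picking from the registry.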

Hopefully we can reuse a lot of existing pieces here.
For example, if we can get fragments for each doc then we can index the fragments as if they were documents, query that new fragment index, and take the top result.
Can we prepare some of this ahead of time? If we record the fragment boundaries at indexing time, perhaps we wouldn't need to create a new fragment index during the highlighting stage.
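
A rough sketch of the "record boundaries at indexing time" idea: store `(start, end)` character offsets per document when it is indexed, then score slices of the stored text at highlight time instead of building a second index. `FragmentStore` and `sentence_spans` are hypothetical names, the sentence regex is naive, and the scoring is a plain term count, so treat this as an illustration of the data layout only.

```python
import re
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive

def sentence_spans(text: str) -> List[Span]:
    """Offsets of naive sentences, with surrounding whitespace trimmed."""
    spans: List[Span] = []
    for m in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = m.group(0)
        start = m.start() + (len(raw) - len(raw.lstrip()))
        end = m.end() - (len(raw) - len(raw.rstrip()))
        if start < end:  # skip whitespace-only matches
            spans.append((start, end))
    return spans

class FragmentStore:
    """Keeps each doc's text plus fragment offsets computed once, at indexing time."""

    def __init__(self) -> None:
        self.texts: Dict[str, str] = {}
        self.spans: Dict[str, List[Span]] = {}

    def add(self, doc_id: str, text: str) -> None:
        self.texts[doc_id] = text
        self.spans[doc_id] = sentence_spans(text)

    def best_fragment(self, doc_id: str, query: str) -> Optional[Span]:
        """Return the offsets of the highest scoring fragment, or None."""
        terms = set(re.findall(r"\w+", query.lower()))
        text = self.texts[doc_id]

        def score(span: Span) -> int:
            start, end = span
            tokens = re.findall(r"\w+", text[start:end].lower())
            return sum(1 for tok in tokens if tok in terms)

        spans = self.spans[doc_id]
        return max(spans, key=score) if spans else None
```

Returning offsets rather than strings leaves formatting to the caller, who can slice the original text and apply whatever markup the results page needs.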

valencik · 2024-11-29T15:58:23Z

Some initial work done in #255

valencik mentioned this issue Nov 2, 2024

increase accuracy and test coverage for PlaintextRenderer #250

Merged
