[FEATURE] enable handling synonyms in neural ingestion pipelines #545

Open
asfoorial opened this issue Jan 22, 2024 · 0 comments

Handling synonyms is a basic requirement for semantic search, yet the current neural search plugin does not handle them.

I suggest a feature that enables this out of the box, without the expense of fine-tuning an embedding model. One simple way to do it is to replace each word with both the word and its synonyms before generating the vector. The source text stays the same, but the embedding reflects the synonyms. The option could be disabled when the embedding model already understands synonyms.

Example:

src = "OS is amazing"
dst = "[[Open Search]] OS is amazing"
vec = getEmbedding(dst)

POST test/_doc/1
{
  "content": src,
  "vec": vec
}
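
The idea above could be sketched roughly as follows in Python. This is only an illustration, not an existing plugin API: `SYNONYMS`, `expand_synonyms`, `build_document`, and the `embed` callable are hypothetical names, and the expanded text simply appends synonyms after the matched word instead of using the `[[...]]` notation from the example.

```python
import re

# Hypothetical synonym table; a real pipeline would load this from configuration.
SYNONYMS = {"os": ["Open Search"]}


def expand_synonyms(text, synonyms, enabled=True):
    """Return text with each known word followed by its synonyms."""
    if not enabled:
        # The embedding model is assumed to handle synonyms on its own.
        return text

    def add_synonyms(match):
        word = match.group(0)
        extra = synonyms.get(word.lower(), [])
        return " ".join([word, *extra])

    return re.sub(r"\w+", add_synonyms, text)


def build_document(src, embed, synonyms=SYNONYMS, enabled=True):
    """Embed the expanded text but store the original text unchanged."""
    dst = expand_synonyms(src, synonyms, enabled)
    return {"content": src, "vec": embed(dst)}
```

Here `embed` stands in for whatever model call produces the vector. The returned dictionary corresponds to the body of the `POST test/_doc/1` request above, with `content` holding the untouched source text.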

Projects
Status: Backlog