whoosh library is outdated #13

Open
2 tasks
jdangerx opened this issue Jan 15, 2025 · 1 comment

Comments

@jdangerx

jdangerx commented Jan 15, 2025

Overview

The whoosh library we've been using for our keyword search is no longer maintained (thanks to @krivard for pointing this out!). It would be good to switch to something that is.

But I also want to avoid deploying a whole new service just to search... 200 documents.

Looks like our main options are:

  • whoosh-reloaded - a maintained version of whoosh, though it doesn't seem that maintained. The latest update was about a year ago, and despite releasing version 3.0.0, the docs all still say 2.7.4.
  • xapian-bindings-binary - Python bindings for the Xapian search engine. Old-school, 'battle-tested', but also kind of clunky.
  • tantivy-py - Python bindings for tantivy. New-school, basically a Rust clone of Lucene. Its API immediately made sense to me, unlike the Xapian one.
  • dark horse: use sentence-transformers for semantic search!

At this point I think the last two (tantivy-py and sentence-transformers) are the most appealing.

Success Criteria

  • we don't depend on whoosh
  • search still returns reasonable results
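
One way to pin down "reasonable results" is a small, backend-agnostic regression check: for a handful of queries, assert that the documents we expect show up in the top few hits. Below is a minimal sketch, not a proposal for the actual implementation. The names `SearchFn`, `check_search`, and `toy_search`, plus the documents and queries, are all made up for illustration; the real search backend (tantivy-py, sentence-transformers, or anything else) would be wrapped as a function from a query string to a ranked list of document ids.

```python
from typing import Callable

# Hypothetical interface: any backend wrapped as "query -> ranked list of doc ids".
SearchFn = Callable[[str], list[str]]


def check_search(
    search: SearchFn, expectations: dict[str, set[str]], k: int = 5
) -> dict[str, set[str]]:
    """Return, per query, the expected doc ids that are missing from the top k hits."""
    misses: dict[str, set[str]] = {}
    for query, expected in expectations.items():
        top_k = set(search(query)[:k])
        missing = expected - top_k
        if missing:
            misses[query] = missing
    return misses


# Made-up documents standing in for the real corpus.
DOCS = {
    "doc-a": "annual report for electric utilities",
    "doc-b": "monthly fuel receipts and costs",
    "doc-c": "electric generator capacity and fuel type",
}


def toy_search(query: str) -> list[str]:
    """Stand-in backend: rank docs by how many distinct query terms they contain."""
    terms = set(query.lower().split())
    scores = {
        doc_id: len(terms & set(text.lower().split()))
        for doc_id, text in DOCS.items()
    }
    hits = [doc_id for doc_id, score in scores.items() if score > 0]
    # Highest score first; ties broken by doc id so the order is deterministic.
    return sorted(hits, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up expectations: query -> doc ids that should appear in the top k.
EXPECTATIONS = {
    "electric utilities": {"doc-a"},
    "fuel costs": {"doc-b"},
}

if __name__ == "__main__":
    misses = check_search(toy_search, EXPECTATIONS)
    print("all expected docs found" if not misses else f"missing: {misses}")
```

Running the same expectations against the current whoosh-based search and against a candidate replacement would give a rough before/after comparison without tying the check to any one library.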

Next steps

@zaneselvans

Ah, the cycle of life (OSS). 👶🏼👧🏼👩🏼👵🏼🧟‍♀️
