Improving Self-Hosting and Removing 3rd Party dependencies. #4513

Open
wants to merge 100 commits into
base: main

Podginator
Contributor

The intent of this PR is to improve the Self-Hosting documentation and to provide a working setup for running Omnivore with Docker and Docker Compose. As far as possible, it removes third-party dependencies and reliance on external infrastructure providers such as GCP.

The aim is to establish feature parity, or near feature parity, with the previously hosted service. This includes RSS support, webhook support, email newsletters, and PDF support.

The list of changes to date is below:

  • Create a Dockerfile for the queue processor, which parses articles and runs asynchronous tasks.

  • Update and expose ImageProxy and use the latest version with ARM64 support.

  • Create new docker-compose file in self-hosting/docker-compose.

  • Provide a minimal .env file to be able to run the service using docker-compose.

  • Create a guide for using Cloudflare Tunnels as a way to expose a device hosted at home.

  • Create an NGINX configuration for those who want to put the service behind an NGINX reverse proxy.

  • Replace use of Google Cloud Storage with MinIO, an open-source layer compatible with the S3 API that can run on-device.

    • This also allows other services, such as R2 and S3, to be the storage provider, if wanted.
  • Improvements to content-fetching to minimise instances where articles refused to parse.

    • Also avoid using Puppeteer for some articles, relying on the raw HTML instead.
  • Overhaul the way email works, to ensure that there is an open source version. Three options are provided here.

    • Docker Mailserver: A production-ready fullstack but simple containerized mail server. This allows incoming emails to be received, parsed, and then added to Omnivore.
    • Amazon Simple Email Service: A service provided by AWS that has a free tier. Allows emails to be received at a domain. A setup guide is in the Self-hosting readme.
    • Zapier: Used as a way to integrate Gmail with the self-hosted instance. This could realistically also be achieved using some of the Gmail APIs.
  • Replace PSPDFKit, which required a license and displayed the following when viewing PDFs:
    image

    • Have an option for the Native Browser PDF Viewer for PDF Files. This removes the highlight functionality, but is stable.
      image
    • Create a new PDF viewer using PDF.js, an open-source PDF library that backs the PDF viewer in Firefox. This option offers near feature parity with PSPDFKit (highlights, reading progress), but may have some bugs.
      image
  • Add some additional fixes to article parsing, such as a Medium parser and a Wired parser.

  • Updated Docker images and software to the latest LTS version of Node (20.12)
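
To illustrate how an S3-compatible storage layer lets MinIO, R2, or S3 be swapped through configuration alone, here is a minimal TypeScript sketch. The environment variable names (`STORAGE_ENDPOINT`, `STORAGE_BUCKET`, and so on), the `StorageConfig` type, and the `storageConfigFromEnv` helper are hypothetical and are not taken from this PR. The returned object only mirrors the usual shape of S3 client options (endpoint, region, path-style addressing) and is not tied to any particular SDK.

```typescript
// Hypothetical config shape; field and variable names are illustrative only.
interface StorageConfig {
  endpoint: string | undefined // undefined: let the client use its default (e.g. AWS S3)
  region: string
  bucket: string
  forcePathStyle: boolean
  accessKeyId: string
  secretAccessKey: string
}

type Env = Record<string, string | undefined>

function storageConfigFromEnv(env: Env): StorageConfig {
  // Fail early with a clear message when a mandatory setting is absent.
  const required = (name: string): string => {
    const value = env[name]
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`)
    }
    return value
  }

  const endpoint = env['STORAGE_ENDPOINT'] || undefined
  const pathStyleOverride = env['STORAGE_FORCE_PATH_STYLE']

  return {
    endpoint,
    region: env['STORAGE_REGION'] || 'us-east-1',
    bucket: required('STORAGE_BUCKET'),
    // Self-hosted S3-compatible servers are commonly addressed as
    // http://host:port/bucket, so default to path-style addressing whenever
    // a custom endpoint is set, unless explicitly overridden.
    forcePathStyle:
      pathStyleOverride !== undefined
        ? pathStyleOverride === 'true'
        : endpoint !== undefined,
    accessKeyId: required('STORAGE_ACCESS_KEY'),
    secretAccessKey: required('STORAGE_SECRET_KEY'),
  }
}

// Example: a local MinIO container reachable as "minio" on a Docker network.
const minioConfig = storageConfigFromEnv({
  STORAGE_ENDPOINT: 'http://minio:9000',
  STORAGE_BUCKET: 'omnivore',
  STORAGE_ACCESS_KEY: 'example-key',
  STORAGE_SECRET_KEY: 'example-secret',
})
```

In a Node service the same helper could be called with `process.env`. Pointing it at a different provider would then only mean changing the endpoint, region, and credentials in the `.env` file, assuming the application reads its storage settings this way.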

To-Do:

  • Re-enable YouTube features, such as extraction of transcripts.
    • Allow both an AI-based feature for this and a less formatted version.
  • Provide a guide on how to get up and running using Kubernetes.
  • Provide a guide on how to get up and running with Tailscale.
  • Provide a guide on getting email to work with Gmail without the use of an external server.
  • Attempt to provide a lighter-weight queuing system, and removal of Redis/Caching for single-user hosting.

@jtbrough

Hi all,

I was so disappointed when Omnivore got rugged. Not because I'm not happy for the team (wish them all success), but because ElevenReader is absolutely not equivalent, and I've seen nothing indicating that they're making real progress paid or otherwise.

The idea of a truly open / self-hosted alternative is absolutely the answer. So thankful to see the work being done by this group. Is there a plan to fork or take over this project? I would like to sponsor the project if possible.

7 participants