feat(dgw): persistent job queue for crash resistance #1108

Open
wants to merge 4 commits into base: master
Commits on Nov 14, 2024

  1. feat(dgw): persistent job queue for crash resistance

    This year we added some background tasks in the Gateway that should
    not be canceled, or if they are, should be restarted later. Essentially
    two tasks: mass deletion of recordings (relatively important, but
    it's always possible to launch indexing in DVLS in case of a problem)
    and remuxing recordings to webm format (good to have). If the service
    is killed in the middle of one of these operations, we should resume
    execution on the next startup.
    
    This persistent job queue is implemented using Turso’s libSQL. Using
    libSQL (or SQLite) to implement the queue allows us to benefit from
    all the work put into building a reliable, secure and performant
    disk-based database instead of attempting to implement our own ad-hoc
    storage and debugging it forever.
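
The resume-on-startup behavior described above can be sketched with an SQLite-backed queue. This is a minimal illustration, not the code from this PR: it uses Python's built-in `sqlite3` module rather than libSQL, and the `job_queue` table, its columns, the job names, and the helpers `open_queue`, `push`, `claim` and `complete` are all hypothetical. It also assumes a single worker process; several concurrent workers would need stricter locking around `claim`.

```python
import json
import sqlite3

# Hypothetical schema: one row per pending job; status is 'queued' or 'running'.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_queue (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    name    TEXT NOT NULL,
    payload TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'queued'
)
"""


def open_queue(path):
    """Open (or create) the queue and requeue jobs interrupted by a crash."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    # A job still marked 'running' at startup never finished: run it again.
    conn.execute("UPDATE job_queue SET status = 'queued' WHERE status = 'running'")
    conn.commit()
    return conn


def push(conn, name, payload):
    """Persist a new job; the payload is stored as JSON text."""
    with conn:
        conn.execute(
            "INSERT INTO job_queue (name, payload) VALUES (?, ?)",
            (name, json.dumps(payload)),
        )


def claim(conn):
    """Mark the oldest queued job as running and return it, or None if empty."""
    with conn:
        row = conn.execute(
            "SELECT id, name, payload FROM job_queue "
            "WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE job_queue SET status = 'running' WHERE id = ?", (row[0],))
    return row[0], row[1], json.loads(row[2])


def complete(conn, job_id):
    """Remove a finished job so it is never picked up again."""
    with conn:
        conn.execute("DELETE FROM job_queue WHERE id = ?", (job_id,))
```

Because a job row is only deleted in `complete`, a process killed between `claim` and `complete` leaves the row behind, and the next call to `open_queue` puts it back in the queued state. This gives at-least-once execution, so jobs such as remuxing or deletion should tolerate being restarted.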
    
    Inspiration was taken from 37signals' Solid Queue:
    
    - https://dev.37signals.com/introducing-solid-queue/
    - https://github.com/rails/solid_queue/
    
    And "How to build a job queue with Rust and PostgreSQL" from kerkour.com:
    
    - https://kerkour.com/rust-job-queue-with-postgresql
    
    The 'user_version' value, which is a SQLite PRAGMA, is used to keep track
    of the migration state. It's a very lightweight approach as it is just an
    integer at a fixed offset in the SQLite file.
    
    - https://sqlite.org/pragma.html#pragma_user_version
    - https://www.sqlite.org/fileformat.html#user_version_number
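
As a rough sketch of how `user_version` can drive migrations, the example below applies each pending migration in its own transaction and bumps the pragma alongside it. It again uses Python's built-in `sqlite3` as a stand-in for libSQL; the `MIGRATIONS` list and the helpers `open_db` and `read_user_version_from_file` are hypothetical and not taken from this PR. The second helper reads the value straight from the file header, where the SQLite file format stores it as a 4-byte big-endian integer at offset 60.

```python
import sqlite3
import struct

# Hypothetical migrations: entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE job_queue (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE job_queue ADD COLUMN status TEXT NOT NULL DEFAULT 'queued'",
]


def open_db(path):
    """Open the database and apply every migration newer than user_version."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT below fully control the transaction boundaries.
    conn = sqlite3.connect(path, isolation_level=None)
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for target in range(version + 1, len(MIGRATIONS) + 1):
        conn.execute("BEGIN")
        try:
            conn.execute(MIGRATIONS[target - 1])
            # PRAGMA does not accept bound parameters; target is a trusted int.
            conn.execute(f"PRAGMA user_version = {target}")
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")
            raise
    return conn


def read_user_version_from_file(path):
    """Read user_version directly from the database file header (offset 60)."""
    with open(path, "rb") as f:
        f.seek(60)
        return struct.unpack(">I", f.read(4))[0]
```

Opening the same file a second time finds `user_version` already equal to the number of migrations, so nothing is re-applied.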
    
    Introducing Turso’s libSQL, as opposed to plain SQLite, will serve us
    for "Recording Farms" in the future. We’ll want instances of the same
    Recording Farm to coordinate. At that point, we’ll want to use Turso's
    libSQL network database feature, because putting the SQLite database
    file on a virtual filesystem is not recommended: it can lead to
    corruption and data loss. Turso will allow us to have a local mode for
    the simplest setups, and a networked, distributed mode for Recording
    Farms when we get there.
    CBenoit committed Nov 14, 2024
    770222f
  2. .

    CBenoit committed Nov 14, 2024
    5927030

Commits on Nov 15, 2024

  1. ebb72df
  2. .

    CBenoit committed Nov 15, 2024
    6763848