Updating (#183)
* Update README.md

* Update subgen.py

* Fix detect language flow for non-bazarr

* Bump version

* Fix pyav error catch

* Fix for LRC putting newlines inappropriately

* Hopefully fix blank exceptions

next() may have been exhausted in previous version too early.  Trying a list instead and other catches.

* version bump because I'm an idiot

* Take 2 on has_subtitle_language_in_file

* Added plex ability to queue future episodes.  

PLEX_QUEUE_NEXT_EPISODE and PLEX_QUEUE_SERIES

* Update README.md

* Removed debugging statements

* Fix subtitle naming for translate

Always default the subtitle name to eng, unless namesublang set to something else.

* Fix semicolon because i'm editing in a browser on my phone...

* Fix subtitle naming logic

* Version bump and fix default actions if translating to english

* Update launcher.py

* Fixed deprecated call to transcribe_stable -> transcribe

* Clarify Bazarr setup

* Update README.md

* Potential fix for garbage collector

* Fix LRC generation not being skipped properly when it already exists.

* Move LRC check elsewhere

* somehow deleted the name of a function...

* Add Afar

* Fix Afar typo...

* Fix for monitor files

* Properly de-duplicate the queue and processing

* Define the queue properly...

* Fix where task_queue is defined.

* Don't purge model with active transcriptions.

* Add mka audio extension.

* Clean up some logging

More to come...

* Double log line removed

* renamed function and cleaned up readability

* fix for lrc files

* Attempt to make a ctranslate2 image with compute 5 capability

Should support older GPUs

* Create build_GPU_Compute5.yml

* Update build_GPU_Compute5.yml

* Update Dockerfile.compute5

* Update Dockerfile.compute5

* Update build_GPU_Compute5.yml

* Update build_GPU_Compute5.yml

* Update build_GPU_Compute5.yml

* Create Dockerfile.compute5.0

* Update build_GPU_Compute5.yml

* Update Dockerfile.compute5.0

* Delete Dockerfile.compute5.0

* Delete Dockerfile.compute5

* Delete .github/workflows/build_GPU_Compute5.yml

* attempt to make the image smaller

* Update Dockerfile.cpu

alpine doesn't have torch

* Update Dockerfile.cpu

* Update Dockerfile.cpu

* Update Dockerfile.cpu

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile.cpu

* Update Dockerfile

* Update Dockerfile.cpu

* Update Dockerfile.cpu

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile.cpu

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile.cpu

* Update Dockerfile

* Print out which file we're actively working on and updated Queue functions

* Remove unused function

* Update calver.yml
McCloudS authored Feb 6, 2025
1 parent 2c7f526 commit 1de6a10
Showing 7 changed files with 426 additions and 159 deletions.
21 changes: 10 additions & 11 deletions .github/workflows/calver.yml
@@ -16,23 +16,17 @@ jobs:
- name: Checkout code
uses: actions/checkout@v3
with:
# Fetch only the latest commit initially
fetch-depth: 1
# Fetch the full history, it's important here
fetch-depth: 0
ref: main

- name: Fetch commits for this month
run: |
# Fetch commits starting from the first day of the current month
YEAR=$(date +%Y)
MONTH=$(date +%m)
git fetch --shallow-since="$YEAR-$MONTH-01"
- name: Calculate version
id: version
run: |
# Calculate the commit count for this month
YEAR=$(date +%Y)
MONTH=$(date +%m)
# count commits since start of the month, limiting scope
COMMIT_COUNT=$(git rev-list --count HEAD --since="$YEAR-$MONTH-01")
echo "COMMIT_COUNT=$COMMIT_COUNT"
echo "VERSION=${YEAR}.${MONTH}.${COMMIT_COUNT}" >> $GITHUB_ENV
@@ -56,5 +50,10 @@ jobs:
# Amend the most recent commit, reusing the previous commit message
git commit --amend --reuse-message=HEAD --author="${GIT_AUTHOR_NAME} <${GIT_AUTHOR_EMAIL}>"
# Push the amended commit
git push --force
# Attempt a regular push first. If it fails because of remote changes, use --force-with-lease cautiously.
git push origin HEAD:main
# Alternative: Use --force-with-lease if a regular push fails
# This is much safer than --force, but still requires care
# If this fails as well (e.g., very recent conflict), you'll need manual intervention.
# git push --force-with-lease origin HEAD:main
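
The version step in the workflow above builds a calendar version from the year, the zero-padded month, and the number of commits since the first of the month. A minimal Python sketch of the same string construction follows. It assumes the commit count has already been obtained elsewhere; the function name `calver` is hypothetical and not part of the repository.

```python
from datetime import date


def calver(today: date, commit_count: int) -> str:
    """Build a YEAR.MONTH.COUNT version string.

    Mirrors the shell step: `date +%Y`, `date +%m` (zero-padded month),
    and a commit count supplied by the caller.
    """
    return f"{today.year}.{today.month:02d}.{commit_count}"


print(calver(date(2025, 2, 6), 12))
```

Because the count restarts each month, the third component is only meaningful together with the first two.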
48 changes: 33 additions & 15 deletions Dockerfile
@@ -1,23 +1,41 @@
# Stage 1: Builder
FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04 AS builder

WORKDIR /subgen

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
python3 \
python3-pip \
ffmpeg \
git \
&& rm -rf /var/lib/apt/lists/*

# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Stage 2: Runtime
FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04

WORKDIR /subgen

ADD https://raw.githubusercontent.com/McCloudS/subgen/main/requirements.txt /subgen/requirements.txt
# Copy necessary files from the builder stage
COPY --from=builder /subgen/launcher.py .
COPY --from=builder /subgen/subgen.py .
COPY --from=builder /subgen/language_code.py .
COPY --from=builder /usr/local/lib/python3.10/dist-packages /usr/local/lib/python3.10/dist-packages

RUN apt-get update \
&& apt-get install -y \
python3 \
python3-pip \
ffmpeg \
git \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& pip3 install -r requirements.txt
# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
ffmpeg \
python3 \
&& rm -rf /var/lib/apt/lists/*

ENV PYTHONUNBUFFERED=1

ADD https://raw.githubusercontent.com/McCloudS/subgen/main/launcher.py /subgen/launcher.py
ADD https://raw.githubusercontent.com/McCloudS/subgen/main/subgen.py /subgen/subgen.py
ADD https://raw.githubusercontent.com/McCloudS/subgen/main/language_code.py /subgen/language_code.py

CMD [ "bash", "-c", "python3 -u launcher.py" ]
# Set command to run the application
CMD ["python3", "launcher.py"]
41 changes: 25 additions & 16 deletions Dockerfile.cpu
@@ -1,23 +1,32 @@
FROM python:3.11-slim-bullseye
# === Stage 1: Build dependencies and install packages ===
FROM python:3.11-slim-bullseye AS builder

WORKDIR /subgen

ADD https://raw.githubusercontent.com/McCloudS/subgen/main/requirements.txt /subgen/requirements.txt
# Install required build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
ffmpeg \
git \
&& rm -rf /var/lib/apt/lists/*

RUN apt-get update \
&& apt-get install -y \
python3 \
python3-pip \
ffmpeg \
git \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& pip install -r requirements.txt
# Copy and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install torch torchaudio --extra-index-url https://download.pytorch.org/whl/cpu && pip install --no-cache-dir --prefix=/install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu

ENV PYTHONUNBUFFERED=1
# === Stage 2: Create a minimal runtime image ===
FROM python:3.11-slim-bullseye AS runtime

ADD https://raw.githubusercontent.com/McCloudS/subgen/main/launcher.py /subgen/launcher.py
ADD https://raw.githubusercontent.com/McCloudS/subgen/main/subgen.py /subgen/subgen.py
ADD https://raw.githubusercontent.com/McCloudS/subgen/main/language_code.py /subgen/language_code.py
WORKDIR /subgen

# Install only required runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*

# Copy only necessary files from builder stage
COPY --from=builder /install /usr/local

# Copy source code
COPY launcher.py subgen.py language_code.py /subgen/

CMD [ "bash", "-c", "python3 -u launcher.py" ]
CMD ["python3", "launcher.py"]
29 changes: 22 additions & 7 deletions README.md
@@ -4,6 +4,10 @@
<details>
<summary>Updates:</summary>

23 Dec: Added PLEX_QUEUE_NEXT_EPISODE and PLEX_QUEUE_SERIES. Will automatically start generating subtitles for the next episode in your series, or queue the whole series.

4 Dec: Added more ENV settings: DETECT_LANGUAGE_OFFSET, PREFERRED_AUDIO_LANGUAGES, SKIP_IF_AUDIO_TRACK_IS, ONLY_SKIP_IF_SUBGEN_SUBTITLE, SKIP_UNKNOWN_LANGUAGE, SKIP_IF_LANGUAGE_IS_NOT_SET_BUT_SUBTITLES_EXIST, SHOULD_WHISPER_DETECT_AUDIO_LANGUAGE

30 Nov 2024: Significant refactoring and handling by Muisje. Added language code class for more robustness and flexibility and ability to separate audio tracks to make sure you get the one you want. New ENV Variables: SUBTITLE_LANGUAGE_NAMING_TYPE, SKIP_IF_AUDIO_TRACK_IS, PREFERRED_AUDIO_LANGUAGE, SKIP_IF_TO_TRANSCRIBE_SUB_ALREADY_EXIST

There will be some minor hiccups, so please identify them as we work through this major overhaul.
@@ -117,7 +121,15 @@ If you want to use a GPU, you need to map it accordingly.

#### Unraid

While Unraid doesn't have an app or template for quick install, with minor manual work, you can install it. See [https://github.com/McCloudS/subgen/issues/37](https://github.com/McCloudS/subgen/discussions/137) for pictures and steps.
While Unraid doesn't have an app or template for quick install, with minor manual work, you can install it. See [https://github.com/McCloudS/subgen/discussions/137](https://github.com/McCloudS/subgen/discussions/137) for pictures and steps.

## Bazarr

You only need to configure the Whisper Provider as shown below: <br>
![bazarr_configuration](https://wiki.bazarr.media/Additional-Configuration/images/whisper_config.png) <br>
The Docker Endpoint is the IP address and port of your subgen container (e.g., http://192.168.1.111:9000). See https://wiki.bazarr.media/Additional-Configuration/Whisper-Provider/ for more info. **127.0.0.1 WILL NOT WORK IF YOU ARE RUNNING BAZARR IN A DOCKER CONTAINER!** I recommend not using the Bazarr provider together with other webhooks in Subgen, or you will likely be generating duplicate subtitles. If you are using Bazarr, path mapping isn't necessary, as Bazarr sends the file over HTTP.

**The defaults of Subgen will allow it to run in Bazarr with zero configuration. However, you will probably want to change, at a minimum, `TRANSCRIBE_DEVICE` and `WHISPER_MODEL`.**

## Plex

@@ -131,12 +143,6 @@ Emby was really nice and provides good information in their responses, so we don

Remember, Emby and Subgen need to be able to see the exact same files at the exact same paths, otherwise you need `USE_PATH_MAPPING`.

## Bazarr

You only need to configure the Whisper Provider as shown below: <br>
![bazarr_configuration](https://wiki.bazarr.media/Additional-Configuration/images/whisper_config.png) <br>
The Docker Endpoint is the IP address and port of your subgen container (e.g., http://192.168.1.111:9000). See https://wiki.bazarr.media/Additional-Configuration/Whisper-Provider/ for more info. I recommend not enabling this with other webhooks, or you will likely be generating duplicate subtitles. If you are using Bazarr, path mapping isn't necessary, as Bazarr sends the file over HTTP.

## Tautulli

Create the webhooks in Tautulli with the following settings:
@@ -221,6 +227,15 @@ The following environment variables are available in Docker. They will default
| SKIP_IF_AUDIO_TRACK_IS | '' | Takes a pipe separated `\|` list of 3 letter language codes to skip if the file has audio in that language. This could be used to skip generating subtitles for a language you don't want, like, I speak English, don't generate English subtitles (for example: 'eng\|deu')|
| PREFERRED_AUDIO_LANGUAGE | 'eng' | If there are multiple audio tracks in a file, it will prefer this setting |
| SKIP_IF_TO_TRANSCRIBE_SUB_ALREADY_EXIST | True | Skips generation of subtitle if a file matches our desired language already. |
| DETECT_LANGUAGE_OFFSET | 0 | Allows you to shift when to run detect_language, geared towards avoiding introductions or songs. |
| PREFERRED_AUDIO_LANGUAGES | 'eng' | Pipe separated list |
| SKIP_IF_AUDIO_TRACK_IS | '' | Takes a pipe separated list of ISO 639-2 languages. Skips generation of subtitle if the file has the audio file listed. |
| ONLY_SKIP_IF_SUBGEN_SUBTITLE | False | Skips generation of subtitles if the file has "subgen" somewhere in the same |
| SKIP_UNKNOWN_LANGUAGE | False | Skips generation if the file has an unknown language |
| SKIP_IF_LANGUAGE_IS_NOT_SET_BUT_SUBTITLES_EXIST | False | Skips generation if file doesn't have an audio stream marked with a language |
| SHOULD_WHISPER_DETECT_AUDIO_LANGUAGE | False | Should Whisper try to detect the language if there is no audio language specified via force language |
| PLEX_QUEUE_NEXT_EPISODE | False | Will queue the next Plex series episode for subtitle generation if subgen is triggered. |
| PLEX_QUEUE_SERIES | False | Will queue the whole Plex series for subtitle generation if subgen is triggered. |
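
Several of the variables in the table above take a pipe separated list of language codes. As an illustration only, such a value could be split like this. The helper name `parse_pipe_list` and the lowercasing are assumptions made for this sketch; it is not taken from subgen.py.

```python
import os


def parse_pipe_list(raw: str) -> list[str]:
    """Split a pipe-separated string into lowercase, non-empty codes."""
    return [part.strip().lower() for part in raw.split("|") if part.strip()]


# Hypothetical usage: read the variable with an empty default,
# which yields an empty list when it is unset.
skip_codes = parse_pipe_list(os.getenv("SKIP_IF_AUDIO_TRACK_IS", ""))
print(parse_pipe_list("eng|deu"))
```

Dropping empty parts means that an unset variable or a trailing `|` produces no entries instead of an empty-string code.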

### Images:
`mccloud/subgen:latest` is GPU or CPU <br>
1 change: 1 addition & 0 deletions language_code.py
@@ -2,6 +2,7 @@

class LanguageCode(Enum):
# ISO 639-1, ISO 639-2/T, ISO 639-2/B, English Name, Native Name
AFAR = ("aa", "aar", "aar", "Afar", "Afar")
AFRIKAANS = ("af", "afr", "afr", "Afrikaans", "Afrikaans")
AMHARIC = ("am", "amh", "amh", "Amharic", "አማርኛ")
ARABIC = ("ar", "ara", "ara", "Arabic", "العربية")
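
Each enum member above carries a tuple of ISO 639-1, ISO 639-2/T, ISO 639-2/B, English name and native name. A minimal sketch of how such a tuple-valued enum could be searched by code follows. It uses a two-member copy of the enum, and the `from_iso` helper is hypothetical; the real class may expose a different lookup.

```python
from enum import Enum
from typing import Optional


class LanguageCode(Enum):
    # ISO 639-1, ISO 639-2/T, ISO 639-2/B, English name, native name
    AFAR = ("aa", "aar", "aar", "Afar", "Afar")
    AFRIKAANS = ("af", "afr", "afr", "Afrikaans", "Afrikaans")

    @classmethod
    def from_iso(cls, code: str) -> Optional["LanguageCode"]:
        """Hypothetical helper: match any of the three ISO codes."""
        wanted = code.strip().lower()
        for member in cls:
            if wanted in member.value[:3]:
                return member
        return None


print(LanguageCode.from_iso("aar"))
```

Returning `None` for an unknown code leaves the decision about how to treat unknown languages to the caller.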
2 changes: 1 addition & 1 deletion launcher.py
@@ -100,7 +100,7 @@ def main():

# Construct the argument parser
parser = argparse.ArgumentParser(prog="python launcher.py", formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('-d', '--debug', default=False, action='store_true', help="Enable console debugging")
parser.add_argument('-d', '--debug', default=True, action='store_true', help="Enable console debugging")
parser.add_argument('-i', '--install', default=False, action='store_true', help="Install/update all necessary packages")
parser.add_argument('-a', '--append', default=False, action='store_true', help="Append 'Transcribed by whisper' to generated subtitle")
parser.add_argument('-u', '--update', default=False, action='store_true', help="Update Subgen")
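
The changed `--debug` line combines `action='store_true'` with `default=True`. With argparse, a `store_true` flag can only set the value to `True`, so with this default the option is `True` whether or not `-d` is passed. The standalone sketch below demonstrates that behaviour. The second parser, using `argparse.BooleanOptionalAction` (Python 3.9+), is one possible way to keep a `True` default while still allowing it to be switched off; it is an illustration and not what the commit does.

```python
import argparse

parser = argparse.ArgumentParser()
# As in the new line of the diff: store_true combined with default=True.
parser.add_argument("-d", "--debug", default=True, action="store_true")

without_flag = parser.parse_args([])
with_flag = parser.parse_args(["-d"])
print(without_flag.debug, with_flag.debug)

# Alternative sketch: generates both --debug and --no-debug.
alt_parser = argparse.ArgumentParser()
alt_parser.add_argument("--debug", default=True, action=argparse.BooleanOptionalAction)
disabled = alt_parser.parse_args(["--no-debug"])
print(disabled.debug)
```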