
Plunk API is failing mere minutes after being restarted. #140

Open
MaxDusdal opened this issue Nov 24, 2024 · 5 comments
Labels
bug Something isn't working self-hosting Issues related to self-hosting Plunk

Comments


MaxDusdal commented Nov 24, 2024

I am currently running Coolify on a Hetzner VPS with a wildcard domain pointed at the VPS; HTTPS is served via Let's Encrypt. I have deployed Plunk using both the Docker Compose setup provided in the official documentation and the one-click service package.

Initially, the instance functions as expected for about a minute. However, it subsequently encounters an issue where I am redirected back to the login page, and all API requests fail with a 502 Bad Gateway error.

I have reviewed all open issues in the Coolify GitHub repository related to similar problems (including mentions of "plunk") but have not found a resolution.

Notably, this behavior has so far only been reproducible using the Docker Compose setup from the documentation, and not the one-click install version provided by Coolify. When using the Coolify one-click installation, the /api route consistently times out—even if the frontend briefly appears functional in the browser. In contrast, when deploying via the Docker Compose setup, the /api route returns a response indicating that the backend is functional:

```json
{ "code": 404, "error": "Not Found", "message": "Unknown route", "time": 1732484712303 }
```

Additionally, I observed that setting API_URL to https://$subdomain$/api during deployment works briefly before failing. However, redeploying with http://localhost:4000 as the API_URL allows the instance to run longer, until I refresh the browser without cache. At that point, the Next.js app's build manifest updates to reflect the localhost:4000 API URL, causing further disruptions. As of right now, I cannot explain this behaviour.

Despite significant time spent troubleshooting, I have been unable to identify the root cause or resolve this issue. Any guidance or insights would be greatly appreciated.

Environment variables are configured as follows:

```
API_URI=http://$my domain$/api
APP_URI=https://$my domain$

...aws stuff...
```

I have already looked into #44 and the errors are the same in my instance.

@ardasevinc

Can you try the new healthcheck route implemented in #137?

More info on #114

@rkcreation

For me, even the new healthcheck route throws an error and Plunk crashes (see my comment in #114), and it seems to crash even faster, presumably because of the more frequent calls to the API.

@MaxDusdal
Author

MaxDusdal commented Nov 28, 2024

A quick update: I will follow up with a bigger write-up, more or less a guide, once I can verify that everything keeps working. For now I've fixed the problem with the help of a community member on the Coolify Discord. If you want to see the full troubleshooting steps, please refer to the discussion there.

The problem, or at least my takeaway, lies in the internal and external use of API_URL. Externally, API_URL has to be https://domain.tld/api for clients to actually reach the backend, but internally (e.g. for running the cron jobs) API_URL is http://localhost:4000. The problem is that the container has no knowledge of the external API_URL, and it wouldn't make sense even if it did. We need to separate API_URL into NEXT_PUBLIC_API_URL and an actual API_URL.

The problem was already solved in #119, but that PR has sadly been stale for a while; you can fork it yourself and build your own version for as long as this issue persists.

I didn't set up any specific healthcheck or anything like that. And yes, the issue is similar to the one in #114.
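To make the split concrete, here is a minimal shell sketch of how the two variables differ (the `plunk` hostname and `plunk.example.com` domain are assumptions for illustration, not values from this thread):

```shell
# Sketch of the internal/external split described above.
API_URI="http://plunk:3000/api"                      # internal: cron jobs, server-side calls
NEXT_PUBLIC_API_URI="https://plunk.example.com/api"  # external: baked into the browser bundle

# Crude sanity check: the public URI must be HTTPS, the internal one plain HTTP
# on the container network.
case "$NEXT_PUBLIC_API_URI" in https://*) public_ok=yes ;; *) public_ok=no ;; esac
case "$API_URI" in http://*) internal_ok=yes ;; *) internal_ok=no ;; esac
echo "public_ok=$public_ok internal_ok=$internal_ok"
```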

The updated Docker Compose file would then look something like this:

```yaml
services:
  plunk:
    build:
      context: 'https://github.com/MaxDusdal/plunk.git'
      dockerfile: Dockerfile
      args:
        - 'NEXT_PUBLIC_API_URI=${SERVICE_FQDN_PLUNK}/api'
    depends_on:
      postgresql:
        condition: service_healthy
      redis:
        condition: service_started
    environment:
      - SERVICE_FQDN_PLUNK_3000
      - 'REDIS_URL=redis://redis:6379'
      - 'DATABASE_URL=postgresql://${SERVICE_USER_POSTGRES}:${SERVICE_PASSWORD_POSTGRES}@postgresql/plunk-db?schema=public'
      - 'JWT_SECRET=${SERVICE_PASSWORD_JWTSECRET}'
      - 'AWS_REGION=${AWS_REGION:?}'
      - 'AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:?}'
      - 'AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:?}'
      - 'AWS_SES_CONFIGURATION_SET=${AWS_SES_CONFIGURATION_SET:?}'
      - 'NEXT_PUBLIC_API_URI=${SERVICE_FQDN_PLUNK}/api'
      - 'APP_URI=${SERVICE_FQDN_PLUNK}'
      - 'API_URI=${API_URL}'
      - 'DISABLE_SIGNUPS=${DISABLE_SIGNUPS:-False}'
    entrypoint:
      - /app/entry.sh
    healthcheck:
      test:
        - CMD
        - wget
        - '-q'
        - '--spider'
        - 'http://127.0.0.1:3000'
      interval: 2s
      timeout: 10s
      retries: 15
    image: 'driaug/plunk:latest'
  postgresql:
    image: 'postgres:16-alpine'
    environment:
      - 'POSTGRES_USER=${SERVICE_USER_POSTGRES}'
      - 'POSTGRES_PASSWORD=${SERVICE_PASSWORD_POSTGRES}'
      - 'POSTGRES_DB=${POSTGRES_DB:-plunk-db}'
    volumes:
      - 'plunk-postgresql-data:/var/lib/postgresql/data'
    healthcheck:
      test:
        - CMD-SHELL
        - 'pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}'
      interval: 5s
      timeout: 20s
      retries: 10
  redis:
    image: 'redis:7.4-alpine'
    volumes:
      - 'plunk-redis-data:/data'
    healthcheck:
      test:
        - CMD
        - redis-cli
        - PING
      interval: 5s
      timeout: 10s
      retries: 20
```

@emreloper

emreloper commented Dec 1, 2024

Hey @MaxDusdal, I have a workaround without forking the repo: I just patch the replace-variables.sh script and mount it as a Docker volume. Here is the workaround.

```yaml
services:
  plunk:
    image: "driaug/plunk:1.0.10"
    ports:
      - "3000:3000"
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy

    environment:
      # These are the internal network endpoints.
      # These could even be localhost, since they are used inside the same container.
      API_URI: http://plunk:3000/api
      APP_URI: http://plunk:3000

      # These are the public endpoints used by the client-side JS.
      NEXT_PUBLIC_API_URI: https://yourdomain.com/api
      NEXT_PUBLIC_AWS_REGION: eu-central-1

    volumes:
      # This is the script that replaces the public env vars
      - ./replace-variables.sh:/app/replace-variables.sh
```

And here is the patched `replace-variables.sh`:

```bash
#!/usr/bin/env bash
set -e

echo "Baking Environment Variables..."

if [ -z "${NEXT_PUBLIC_API_URI}" ]; then
    echo "NEXT_PUBLIC_API_URI is not set. Exiting..."
    exit 1
fi

if [ -z "${NEXT_PUBLIC_AWS_REGION}" ]; then
    echo "NEXT_PUBLIC_AWS_REGION is not set. Exiting..."
    exit 1
fi

# Process each directory that might contain JS files
for dir in "/app/packages/dashboard/public" "/app/packages/dashboard/.next"; do
    if [ -d "$dir" ]; then
        # Find all JS files and process them
        # (the parentheses make -o apply to the two -name tests only,
        # so -type f covers both patterns)
        find "$dir" -type f \( -name "*.js" -o -name "*.mjs" \) | while read -r file; do
            if [ -f "$file" ]; then
                # Replace environment variables
                sed -i "s|PLUNK_API_URI|${NEXT_PUBLIC_API_URI}|g" "$file"
                sed -i "s|PLUNK_AWS_REGION|${NEXT_PUBLIC_AWS_REGION}|g" "$file"
                echo "Processed: $file"
            fi
        done
    else
        echo "Warning: Directory $dir does not exist, skipping..."
    fi
done

echo "Environment Variables Baked."
```
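For anyone unsure what the "baking" step actually does, here is a tiny self-contained demonstration of the same sed substitution the script performs (the URL is an assumed example, not a real endpoint):

```shell
# Demonstration of the placeholder replacement performed by the script above.
# plunk.example.com is an assumed example domain.
NEXT_PUBLIC_API_URI="https://plunk.example.com/api"
tmp=$(mktemp)
# A build artifact would contain the placeholder string verbatim:
printf 'fetch("PLUNK_API_URI/users")\n' > "$tmp"
# Same substitution as in replace-variables.sh:
sed -i "s|PLUNK_API_URI|${NEXT_PUBLIC_API_URI}|g" "$tmp"
baked=$(cat "$tmp")
echo "$baked"   # fetch("https://plunk.example.com/api/users")
rm -f "$tmp"
```

Because the value is written into the built JS files, changing `NEXT_PUBLIC_API_URI` requires the replacement (or a rebuild) to run again; that is exactly why the script is mounted and re-run on container start.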

@driaug
Member

driaug commented Dec 31, 2024

Duplicate of #114
Great solution above though.

@driaug driaug added bug Something isn't working self-hosting Issues related to self-hosting Plunk labels Dec 31, 2024