Broker in scheduler unable to connect to Redis #2496

Open
PKizzle opened this issue Jul 3, 2023 · 4 comments

PKizzle commented Jul 3, 2023

Bug report:

The broker in the scheduler logs a connection error signalling that it is unable to authenticate with the Redis server.
This is the error message that is repeated over and over again:
v1/worker.go:84 Broker failed with error: ERR AUTH <password> called without any password configured for the default user. Are you sure your configuration is correct?

Expected behavior:

Authentication succeeds and no error message is logged.

How to reproduce it:

  1. Run a Redis server behind a sentinel, where the server has a password set for the default user but there is no password set for the sentinel
  2. Configure Dragonfly using the official Helm chart and specify the Redis password (set no username)
  3. Check the scheduler logs (you may need to modify the Helm chart so that the scheduler runs with the --console argument)

Environment:

  • Dragonfly version: v2.1.0-beta.1
  • OS: Raspberry Pi OS (Debian)
  • Kernel: 6.1.21-v8+ aarch64 GNU/Linux
  • Redis: 7.2-rc2

PKizzle added the bug label Jul 3, 2023
gaius-qi self-assigned this Jul 3, 2023

gaius-qi commented Jul 3, 2023

@PKizzle Please send me the launch configuration for helm charts, thx.

PKizzle commented Jul 6, 2023

These are all the changes I have made to the values.yaml included with the original Helm chart.
I have replaced all sensitive information with shell-script-style variables.

scheduler:
  config:
    seedPeer:
      enable: false
  metrics:
    prometheusRule:
      enable: true
seedPeer:
  enable: false
dfdaemon:
  console: true
  download:
    totalRateLimit: 40Mi
    perPeerRateLimit: 20Mi
  upload:
    rateLimit: 20Mi
  objectStorage:
    enable: true
    maxReplicas: 1
  storage: 
    taskExpireTime: 3h
    strategy: io.d7y.storage.v2.advance
    diskGCThreshold: 4Gi
  network:
    enableIPv6: true
manager:
  ingress:
    enable: true
    className: "haproxy-internal"
    annotations:
      haproxy.org/ssl-redirect: "true"
    hosts:
      - "${DRAGONFLY_DOMAIN}"
  config:
    auth:
      jwt:
        key: "${JWT_KEY}"
    objectStorage:
      enable: true
      endpoint: "${S3_DOMAIN}"
      accessKey: "${AWS_ACCESS_KEY_ID}"
      secretKey: "${AWS_SECRET_ACCESS_KEY}"
    network:
      enableIPv6: true
    console: true
mysql:
  enable: false

# Custom addition to helm chart for postgres support
externalPostgres:
  migrate: true
  host: "${PGHOST}"
  username: "${PGUSER}"
  password: "${PGPASSWORD}"
  database: "dragonfly"
  port: 5432
  sslMode: require

redis:
  enable: false

externalRedis:
  addrs:
    - "rfs-dragonfly-redis.kube-system.svc.cluster.local:26379"
  masterName: "mymaster"
  username: null
  password: "${REDIS_PASSWORD}"
  db: 0
  brokerDB: 0
  backendDB: 0
  networkTopologyDB: 0

PKizzle commented Jul 17, 2023

Also, the dependency used for the async (job) queue is no longer maintained: RichardKnop/machinery#790
I suspect that this bug is connected to the use of the abandoned project.

PKizzle commented Jul 24, 2023

After taking a closer look at the machinery source code, it turns out that sentinel support is only provided for the go-redis implementation. The factor that decides whether redigo or go-redis is used is the number of broker addresses: two or more addresses lead to the go-redis implementation being used. Thus, adding an additional empty address in the values.yaml file fixes the issue.
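
For reference, the workaround might look roughly like the fragment below. This is only a sketch based on the externalRedis section posted earlier in this thread. The exact form of the "empty address" is an assumption (here an empty string appended as a second list entry), since the comment does not show the final values.yaml.

```yaml
externalRedis:
  addrs:
    - "rfs-dragonfly-redis.kube-system.svc.cluster.local:26379"
    # Assumption: an empty second entry, added only so that the broker
    # sees two addresses and the go-redis (sentinel-capable) path is used.
    - ""
  masterName: "mymaster"
  username: null
  password: "${REDIS_PASSWORD}"
```

The other keys are unchanged from the configuration above. The second entry carries no real endpoint and exists only to raise the address count to two.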

I do not see any reason why machinery uses two different Redis implementations, and I therefore strongly recommend switching to a different job queue dependency that does not introduce this kind of complexity.
