Allow NifiCluster Pod to be suspended when process crashes, to allow for debugging #492

Open
ChrisSamo632 opened this issue Nov 29, 2024 · 3 comments
Labels
community enhancement New feature or request

Comments

@ChrisSamo632

Is your feature request related to a problem?

Configuration problems during NiFi startup can result in the NiFi process crashing, for example if no nifi.sensitive.props.key is defined for Cluster nodes.

The Pod is immediately terminated and replaced with a new Pod by NiFiKop, which quickly fails again. This makes it very difficult to retrieve the logs or any other information about why the Pod keeps crashing.

Describe the solution you'd like to see

A way to suspend the Pod for a set period before it is deleted by NiFiKop, e.g. an operator setting that waits for a configurable time before deleting and re-creating the Pod, so that the logs can be retrieved.

Describe alternatives you've considered

No response

Additional context

No response

@ChrisSamo632 ChrisSamo632 added community enhancement New feature or request labels Nov 29, 2024
@juldrixx
Contributor

juldrixx commented Nov 29, 2024

Hi @ChrisSamo632, you can use sidecars to tail the logs through a volume shared with the nifi container, so if NiFi crashes you still have access to the logs. You can add as many sidecars as you want, each one tailing a specific log file. Personally, that's what we do.

@juldrixx
Contributor

Something like:

apiVersion: nifi.konpyutaika.com/v1
kind: NifiCluster
metadata:
  name: nifikop
spec:
  ...
  nodeConfigGroups:
    default_group:
      ...
      storageConfigs:
        - mountPath: /opt/nifi/nifi-current/logs
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: ssd-wait
  sidecarConfigs:
    - args:
        - tail
        - -n+1
        - -F
        - /var/log/nifi-app.log
      image: busybox:1.36
      name: app-log
      resources:
        limits:
          cpu: 50m
          memory: 50Mi
        requests:
          cpu: 50m
          memory: 50Mi
      volumeMounts:
        - mountPath: /var/log
          name: logs
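
To illustrate the "one sidecar per log file" idea, here is a rough sketch of a second sidecar alongside app-log. It assumes NiFi's default log file name nifi-bootstrap.log (likely the most useful file for startup crashes) and reuses the image, resource limits and shared logs volume from the example above. The container name bootstrap-log is only illustrative.

```yaml
spec:
  sidecarConfigs:
    # Keep the app-log sidecar from the example above in this list as well.
    # Hypothetical second sidecar: the name and the log file are assumptions.
    - name: bootstrap-log
      image: busybox:1.36
      args:
        - tail
        - -n+1
        - -F
        # Assumes the default NiFi bootstrap log file name
        - /var/log/nifi-bootstrap.log
      resources:
        limits:
          cpu: 50m
          memory: 50Mi
        requests:
          cpu: 50m
          memory: 50Mi
      volumeMounts:
        # Same shared volume that the nifi container mounts at its logs directory
        - mountPath: /var/log
          name: logs
```

Each sidecar writes its file to its own container stdout, so the output can be read per container with kubectl logs and the -c option while the Pod exists. The files also stay on the logs volume for as long as its PVC does.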

@ChrisSamo632
Author

Nice idea. One problem is that I've redirected logging to STDOUT/STDERR to avoid log files building up within the Pod/volume, so much of the logging is only available while the Pod's main container is running. These logs are intended to be collected by a central logging process within the cluster. I've not enabled that yet, which probably isn't helping me here, but I'm unsure whether the errors would be collected in the short time the Pod is online before it crashes and is replaced.
