Allow NifiCluster Pod to be suspended when process crashes, to allow for debugging #492

ChrisSamo632 · 2024-11-29T20:34:49Z

Is your feature request related to a problem?

Configuration problems during NiFi startup can result in the NiFi process crashing, for example if no nifi.sensitive.props.key is defined for Cluster nodes.

The Pod is immediately terminated and replaced with a new Pod by NiFiKop, which will quickly fail again. It is very difficult to retrieve the logs or any other information as to why the Pod keeps crashing.

Describe the solution you'd like to see

A way to suspend the Pod for a set period before it is deleted by NiFiKop, e.g. a setting in the operator that waits for a set time before deleting and re-creating the Pod, to allow for log retrieval.

Describe alternatives you've considered

No response

Additional context

No response

juldrixx · 2024-11-29T20:38:14Z

Hi @ChrisSamo632, you can use sidecars to tail the logs through a volume shared with the nifi container. That way, if NiFi crashes, you still have access to the logs. You can add as many sidecars as you want, each one tailing a specific log file. Personally, that's what we do.

juldrixx · 2024-11-29T20:39:57Z

Something like:

apiVersion: nifi.konpyutaika.com/v1
kind: NifiCluster
metadata:
  name: nifikop
spec:
  ...
  nodeConfigGroups:
    default_group:
      ...
      storageConfigs:
        - mountPath: /opt/nifi/nifi-current/logs
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: ssd-wait
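  # Sidecar that tails the application log from the shared `logs` volume
  # (mounted at /var/log below) and prints it to its own stdout.
  sidecarConfigs:
    - args:
        - tail
        - -n+1   # start from the first line of the file
        - -F     # keep following the file by name, retrying if it is recreated
        - /var/log/nifi-app.log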
      image: busybox:1.36
      name: app-log
      resources:
        limits:
          cpu: 50m
          memory: 50Mi
        requests:
          cpu: 50m
          memory: 50Mi
      volumeMounts:
        - mountPath: /var/log
          name: logs
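
To illustrate the "one sidecar per log file" idea, here is a rough sketch of a second entry under the same sidecarConfigs list. It assumes NiFi's default logback configuration is in use, so that nifi-bootstrap.log is written next to nifi-app.log on the shared logs volume. The sidecar name bootstrap-log is hypothetical, and the resource values are simply copied from the example above.

```yaml
spec:
  sidecarConfigs:
    # ... the app-log sidecar from the example above stays unchanged ...
    # Hypothetical second sidecar: tails the bootstrap log from the same volume.
    - name: bootstrap-log            # assumed name, pick any unique one
      image: busybox:1.36
      args:
        - tail
        - -n+1
        - -F
        - /var/log/nifi-bootstrap.log   # assumes the default NiFi log file name
      resources:
        limits:
          cpu: 50m
          memory: 50Mi
        requests:
          cpu: 50m
          memory: 50Mi
      volumeMounts:
        - mountPath: /var/log
          name: logs                 # same PVC-backed volume as the nifi container
```

Each sidecar's output can then be read on its own with kubectl logs <pod> -c bootstrap-log, and because the logs volume is backed by a PVC, the files should survive the Pod being replaced.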

ChrisSamo632 · 2024-11-30T07:42:25Z

Nice idea. A problem, I guess, is that I've redirected logging to STDOUT/STDERR to avoid log files building up within the Pod/volume, so much of the logging is only available while the Pod's main container is running. These logs are intended to be collected by a central logging process within the cluster. I've not enabled that yet, which probably isn't helping me here, but I'm unsure whether the errors would be collected in the short time the Pod is online before it crashes and is replaced.

ChrisSamo632 added the community and enhancement (New feature or request) labels on Nov 29, 2024
