Could not connect to member XXXXXXXXX reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /localhost:5703 #177

Open
bugbugmaker opened this issue Mar 28, 2022 · 7 comments
Labels
question Further information is requested

Comments

@bugbugmaker

Could not connect to member XXXXXXXXX reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /localhost:5703. What causes this?

@saig0 saig0 added the question Further information is requested label Mar 30, 2022
@saig0
Collaborator

saig0 commented Mar 30, 2022

I'm looking in my 🔮 ... and I can't see anything.

@bugbugmaker please provide more information.
Which application do you use?
How do you configure the environment?
Any interesting log file?

@bugbugmaker
Author

bugbugmaker commented Mar 31, 2022

My Zeebe config looks like this:

zeebe:
  broker:
    gateway:
      # Enable the embedded gateway to start on broker startup.
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_GATEWAY_ENABLE.
      enable: true

      network:
        # Sets the port the embedded gateway binds to.
        # This setting can also be overridden using the environment variable ZEEBE_BROKER_GATEWAY_NETWORK_PORT.
        port: 26500

      security:
        # Enables TLS authentication between clients and the gateway
        # This setting can also be overridden using the environment variable ZEEBE_BROKER_GATEWAY_SECURITY_ENABLED.
        enabled: false

    network:
      # Controls the default host the broker should bind to. Can be overwritten on a
      # per binding basis for client, management and replication
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_NETWORK_HOST.
      host: 0.0.0.0

    data:
      # Specify a directory in which data is stored.
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_DATA_DIRECTORY.
      directory: data
      # The size of data log segment files.
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_DATA_LOGSEGMENTSIZE.
      logSegmentSize: 512MB
      # How often we take snapshots of streams (time unit)
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_DATA_SNAPSHOTPERIOD.
      snapshotPeriod: 15m

    cluster:
      # Specifies the Zeebe cluster size.
      # This can also be overridden using the environment variable ZEEBE_BROKER_CLUSTER_CLUSTERSIZE.
      clusterSize: 1
      # Controls the replication factor, which defines the count of replicas per partition.
      # This can also be overridden using the environment variable ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR.
      replicationFactor: 1
      # Controls the number of partitions, which should exist in the cluster.
      # This can also be overridden using the environment variable ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT.
      partitionsCount: 8

    threads:
      # Controls the number of non-blocking CPU threads to be used.
      # WARNING: You should never specify a value that is larger than the number of physical cores
      # available. Good practice is to leave 1-2 cores for ioThreads and the operating
      # system (it has to run somewhere). For example, when running Zeebe on a machine
      # which has 4 cores, a good value would be 2.
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_THREADS_CPUTHREADCOUNT
      cpuThreadCount: 10
      # Controls the number of io threads to be used.
      # This setting can also be overridden using the environment variable ZEEBE_BROKER_THREADS_IOTHREADCOUNT
      ioThreadCount: 2
    backpressure:
      enabled: false
    exporters:
      hazelcast:
        className: io.zeebe.hazelcast.exporter.HazelcastExporter
        jarPath: exporters/zeebe-hazelcast-exporter-1.0.1-jar-with-dependencies.jar
        args:
          # Hazelcast port
          port: 5701
          enabledValueTypes: ""
          enabledRecordTypes: ""
          name: "zeebe"
          capacity: 100000
          timeToLiveInSeconds: 0
          format: "protobuf"
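
The comments in the config above each name an environment variable that overrides the setting. As a minimal sketch, assuming the naming pattern visible in those comments (dotted path, upper-cased, dots replaced by underscores) holds for the other documented settings too, the variable names can be derived like this. The helper names `env_var_name` and `flatten` and the `SAMPLE` dictionary are hypothetical and only illustrate the pattern; they are not part of Zeebe.

```python
def env_var_name(config_path):
    """Turn a dotted config path into the env var name seen in the config comments.

    Assumption: upper-case the path and replace dots with underscores, as in
    zeebe.broker.cluster.partitionsCount -> ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT.
    """
    return config_path.replace(".", "_").upper()


def flatten(config, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


# Hypothetical excerpt of the config above, limited to settings whose
# override variable is named in a comment.
SAMPLE = {
    "zeebe": {
        "broker": {
            "cluster": {
                "clusterSize": 1,
                "replicationFactor": 1,
                "partitionsCount": 8,
            },
            "threads": {"cpuThreadCount": 10, "ioThreadCount": 2},
        }
    }
}

if __name__ == "__main__":
    for path, value in flatten(SAMPLE):
        print(f"{env_var_name(path)}={value}")
```

This makes it easy to switch a single setting, such as the partition count, between runs without editing the file. Whether the exporter arguments can be overridden the same way is not stated in the config comments.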

When I set partitionsCount to 8, the Hazelcast client always prints warning messages. These are the messages:

2022-03-31 09:59:46.722 [hz.client_1.internal-7] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 1179bf3e-c6a9-4662-b4c3-4f70269692d1, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5705
2022-03-31 09:59:46.722 [hz.client_1.internal-1] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 2aabbc4c-6770-4cd7-985f-728ef6b647b1, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5706
2022-03-31 09:59:46.722 [hz.client_1.internal-5] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 52de3ec6-3460-4bf1-855a-31ba229e10f2, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5703
2022-03-31 09:59:46.722 [hz.client_1.internal-4] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 6d745961-7231-40c8-8476-ca2aadf3a863, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5707
2022-03-31 09:59:46.723 [hz.client_1.internal-3] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 9f9c97cd-9b78-4662-bc11-ed65e0aa4a94, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5704
2022-03-31 09:59:46.723 [hz.client_1.internal-9] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 14eab7bc-46a2-4db2-9a67-91f1f9fd0ce5, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5708
2022-03-31 09:59:46.722 [hz.client_1.internal-6] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member a597935b-4127-4e92-b450-6028e7f5a986, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5702

What is the reason?
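
To make the warnings above easier to compare, here is a rough sketch that pulls the member ID, host, and port out of each "Could not connect to member" line. The regular expression is written against the three sample lines copied from this report and is an assumption about the message layout, not a documented Hazelcast log format; `failed_endpoints` is a hypothetical helper name.

```python
import re

# Three of the warning lines from the report, copied verbatim.
LOG = """\
2022-03-31 09:59:46.722 [hz.client_1.internal-7] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 1179bf3e-c6a9-4662-b4c3-4f70269692d1, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5705
2022-03-31 09:59:46.722 [hz.client_1.internal-1] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 2aabbc4c-6770-4cd7-985f-728ef6b647b1, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5706
2022-03-31 09:59:46.722 [hz.client_1.internal-5] WARN c.h.client.impl.connection.ClientConnectionManager - 56 - hz.client_1 [dev] [4.2.2] Could not connect to member 52de3ec6-3460-4bf1-855a-31ba229e10f2, reason com.hazelcast.core.HazelcastException: java.io.IOException: null to address /172.18.0.23:5703
"""

# Assumed layout: "... member <uuid>, reason <text> to address /<host>:<port>"
PATTERN = re.compile(
    r"Could not connect to member (?P<member>[0-9a-f-]{36}), "
    r"reason (?P<reason>.+?) to address /(?P<host>[^:\s]+):(?P<port>\d+)"
)


def failed_endpoints(log_text):
    """Return a (member, host, port) tuple for each connection warning."""
    return [
        (m["member"], m["host"], int(m["port"]))
        for m in PATTERN.finditer(log_text)
    ]


if __name__ == "__main__":
    for member, host, port in sorted(failed_endpoints(LOG), key=lambda e: e[2]):
        print(f"{host}:{port} <- member {member}")
```

Run over the full set of warnings, this lists ports 5702 to 5708 on the same host, while port 5701 is reported as working. Together that is eight ports for partitionsCount: 8, which might hint at one Hazelcast member per partition, though nothing in this thread confirms it.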

@saig0
Collaborator

saig0 commented Mar 31, 2022

Do you see the error messages in the Zeebe broker logs? Or do you connect to Hazelcast with an application?
Please share more of these logs, for example, the complete stack trace, or related log messages.

Do you see the errors also if the partition count is not set to 8?

@bugbugmaker
Author

bugbugmaker commented Mar 31, 2022

When the partition count is the default value of 1, the connection works normally. When the partition count is 8, port 5701 still works normally. With 8 partitions, Zeebe also always prints the following warning messages:

2022-03-31 11:43:26.719 [Broker-0-SnapshotDirector-2] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 7838052 is committed, current commit position is 7838052. After that snapshot can be marked as valid.
2022-03-31 11:43:26.750 [Broker-0-SnapshotDirector-2] [Broker-0-zb-fs-workers-0] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 7838052 >= 7838052, snapshot 7100731-113-7838044-7838052 is valid and has been persisted.
2022-03-31 11:43:26.749 [Broker-0-SnapshotStore-2] [Broker-0-zb-fs-workers-1] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/2/pending/7100081-113-7837324-7837332, checksum=3240330177, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@55d1a9dc, metadata=FileBasedSnapshotMetadata{index=7100081, term=113, processedPosition=7837324, exporterPosition=7837332}}
java.nio.file.NoSuchFileException: /usr/local/zeebe/data/raft-partition/partitions/2/pending/7100081-113-7837324-7837332
        at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at io.camunda.zeebe.util.FileUtil.deleteFolderIfExists(FileUtil.java:58) ~[zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.abortInternal(FileBasedTransientSnapshot.java:144) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.lambda$abort$2(FileBasedTransientSnapshot.java:92) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.0.0.jar:1.0.0]
2022-03-31 11:46:04.986 [Broker-0-SnapshotDirector-7] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 7822794 is committed, current commit position is 7822794. After that snapshot can be marked as valid.
2022-03-31 11:46:05.021 [Broker-0-SnapshotStore-7] [Broker-0-zb-fs-workers-1] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/7/pending/7096559-112-7822082-7822090, checksum=874232151, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@2a18ff02, metadata=FileBasedSnapshotMetadata{index=7096559, term=112, processedPosition=7822082, exporterPosition=7822090}}
java.nio.file.NoSuchFileException: /usr/local/zeebe/data/raft-partition/partitions/7/pending/7096559-112-7822082-7822090
        at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at io.camunda.zeebe.util.FileUtil.deleteFolderIfExists(FileUtil.java:58) ~[zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.abortInternal(FileBasedTransientSnapshot.java:144) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.lambda$abort$2(FileBasedTransientSnapshot.java:92) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.0.0.jar:1.0.0]
2022-03-31 11:46:05.024 [Broker-0-SnapshotDirector-7] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 7822794 >= 7822794, snapshot 7097204-112-7822786-7822794 is valid and has been persisted.
2022-03-31 11:46:17.632 [Broker-0-SnapshotDirector-8] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 7821679 is committed, current commit position is 7821679. After that snapshot can be marked as valid.
2022-03-31 11:46:17.667 [Broker-0-SnapshotDirector-8] [Broker-0-zb-fs-workers-0] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 7821679 >= 7821679, snapshot 7095641-112-7821671-7821679 is valid and has been persisted.
2022-03-31 11:46:17.667 [Broker-0-SnapshotStore-8] [Broker-0-zb-fs-workers-1] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/8/pending/7094992-112-7820951-7820959, checksum=360069983, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@636c1790, metadata=FileBasedSnapshotMetadata{index=7094992, term=112, processedPosition=7820951, exporterPosition=7820959}}
java.nio.file.NoSuchFileException: /usr/local/zeebe/data/raft-partition/partitions/8/pending/7094992-112-7820951-7820959
        at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at io.camunda.zeebe.util.FileUtil.deleteFolderIfExists(FileUtil.java:58) ~[zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.abortInternal(FileBasedTransientSnapshot.java:144) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.lambda$abort$2(FileBasedTransientSnapshot.java:92) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.0.0.jar:1.0.0]
2022-03-31 11:46:18.665 [Broker-0-SnapshotDirector-3] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 7838701 is committed, current commit position is 7838701. After that snapshot can be marked as valid.
2022-03-31 11:46:18.729 [Broker-0-SnapshotStore-3] [Broker-0-zb-fs-workers-1] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/3/pending/7090268-113-7837973-7837981, checksum=73665585, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@162b143c, metadata=FileBasedSnapshotMetadata{index=7090268, term=113, processedPosition=7837973, exporterPosition=7837981}}
java.nio.file.NoSuchFileException: /usr/local/zeebe/data/raft-partition/partitions/3/pending/7090268-113-7837973-7837981
        at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at io.camunda.zeebe.util.FileUtil.deleteFolderIfExists(FileUtil.java:58) ~[zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.abortInternal(FileBasedTransientSnapshot.java:144) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.lambda$abort$2(FileBasedTransientSnapshot.java:92) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.0.0.jar:1.0.0]
2022-03-31 11:46:18.730 [Broker-0-SnapshotDirector-3] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 7838701 >= 7838701, snapshot 7090919-113-7838693-7838701 is valid and has been persisted.
2022-03-31 11:48:05.620 [Broker-0-SnapshotDirector-5] [Broker-0-zb-fs-workers-0] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 7822777 is committed, current commit position is 7822761. After that snapshot can be marked as valid.
2022-03-31 11:48:05.686 [Broker-0-SnapshotDirector-5] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 7822777 >= 7822777, snapshot 7095352-112-7822753-7822761 is valid and has been persisted.
2022-03-31 11:48:05.686 [Broker-0-SnapshotStore-5] [Broker-0-zb-fs-workers-0] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/5/pending/7094728-112-7822049-7822057, checksum=2236401017, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@3943bc18, metadata=FileBasedSnapshotMetadata{index=7094728, term=112, processedPosition=7822049, exporterPosition=7822057}}
java.nio.file.NoSuchFileException: /usr/local/zeebe/data/raft-partition/partitions/5/pending/7094728-112-7822049-7822057
        at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at io.camunda.zeebe.util.FileUtil.deleteFolderIfExists(FileUtil.java:58) ~[zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.abortInternal(FileBasedTransientSnapshot.java:144) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.lambda$abort$2(FileBasedTransientSnapshot.java:92) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.0.0.jar:1.0.0]
2022-03-31 11:53:05.240 [Broker-0-SnapshotDirector-6] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 7822439 is committed, current commit position is 7822439. After that snapshot can be marked as valid.
2022-03-31 11:53:05.280 [Broker-0-SnapshotDirector-6] [Broker-0-zb-fs-workers-0] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 7822439 >= 7822439, snapshot 7092091-112-7822431-7822439 is valid and has been persisted.
2022-03-31 11:53:05.280 [Broker-0-SnapshotStore-6] [Broker-0-zb-fs-workers-1] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/6/pending/7091455-112-7821727-7821735, checksum=2456360609, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@71d854ec, metadata=FileBasedSnapshotMetadata{index=7091455, term=112, processedPosition=7821727, exporterPosition=7821735}}
java.nio.file.NoSuchFileException: /usr/local/zeebe/data/raft-partition/partitions/6/pending/7091455-112-7821727-7821735
        at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.Files.readAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.getAttributes(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.visit(Unknown Source) ~[?:?]
        at java.nio.file.FileTreeWalker.walk(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at java.nio.file.Files.walkFileTree(Unknown Source) ~[?:?]
        at io.camunda.zeebe.util.FileUtil.deleteFolderIfExists(FileUtil.java:58) ~[zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.abortInternal(FileBasedTransientSnapshot.java:144) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot.lambda$abort$2(FileBasedTransientSnapshot.java:92) ~[zeebe-snapshots-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:73) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:39) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:94) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:78) [zeebe-util-1.0.0.jar:1.0.0]
        at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:191) [zeebe-util-1.0.0.jar:1.0.0]
2022-03-31 11:53:27.477 [Broker-0-SnapshotDirector-1] [Broker-0-zb-fs-workers-0] INFO
      io.camunda.zeebe.logstreams.snapshot - Finished taking snapshot, need to wait until last written event position 16266928 is committed, current commit position is 16266928. After that snapshot can be marked as valid.
2022-03-31 11:53:27.557 [Broker-0-SnapshotDirector-1] [Broker-0-zb-fs-workers-1] INFO
      io.camunda.zeebe.logstreams.snapshot - Current commit position 16266928 >= 16266928, snapshot 11348558-144-16266920-16266928 is valid and has been persisted.
2022-03-31 11:53:27.556 [Broker-0-SnapshotStore-1] [Broker-0-zb-fs-workers-0] WARN
      io.camunda.zeebe.snapshots.impl.FileBasedTransientSnapshot - Failed to delete pending snapshot FileBasedTransientSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/1/pending/11347913-144-16266200-16266208, checksum=1761440503, snapshotStore=io.camunda.zeebe.snapshots.impl.FileBasedSnapshotStore@1aedd7e6, metadata=FileBasedSnapshotMetadata{index=11347913, term=144, processedPosition=16266200, exporterPosition=16266208}}

@saig0
Collaborator

saig0 commented Apr 1, 2022

I tried to reproduce the issue. Locally, I can run a Zeebe distribution with 8 partitions and the Hazelcast exporter without problems.

@bugbugmaker in which environment do you run Zeebe (locally with the distro, Docker, docker-compose, K8s, etc.)?

@bugbugmaker
Author

However, when I start it locally, I also get this warning message.

@bugbugmaker
Author

I run it locally.

Is the exporter configured correctly? If the number of partitions is greater than one, does some mapping need to be set up? Port 5701 works normally; the other ports produce warning messages.
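
One way to narrow this down is to check which of the ports accept TCP connections at all from the machine where the client runs. The following is a minimal sketch using only the Python standard library. The host `127.0.0.1` and the port range 5701 to 5708 are assumptions taken from the addresses in this report, and `port_is_open` and `probe_range` are hypothetical helper names. A successful TCP connect only shows that something is listening, not that it is a healthy Hazelcast member.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_range(host, first_port, last_port, timeout=1.0):
    """Map each port in [first_port, last_port] to whether it is reachable."""
    return {
        port: port_is_open(host, port, timeout)
        for port in range(first_port, last_port + 1)
    }


if __name__ == "__main__":
    # Assumed target: the local machine and the ports named in the warnings.
    for port, ok in probe_range("127.0.0.1", 5701, 5708).items():
        print(f"{port}: {'reachable' if ok else 'unreachable'}")
```

If only 5701 is reachable from the client's side while the other ports are listening on the broker machine, that would point at the network setup (for example, ports that are not exposed) rather than at the exporter configuration.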
