Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses #962
zhangyuqiu asked this question in Q&A
Replies: 1 comment
-
I suggest to run |
-
Version: v0.7.0
Script:
import graphscope
sess = graphscope.session(num_workers=1, k8s_coordinator_cpu=1, k8s_coordinator_mem="4Gi", k8s_engine_cpu=4, k8s_engine_mem="4Gi")
Terminal output:
2021-11-08 21:22:08,840 [INFO][session:640]: Initializing graphscope session with parameters: {'addr': None, 'mode': 'eager', 'cluster_type': 'k8s', 'num_workers': 1, 'preemptive': True, 'k8s_namespace': None, 'k8s_service_type': 'NodePort', 'k8s_gs_image': 'registry.cn-hongkong.aliyuncs.com/graphscope/graphscope:0.7.0', 'k8s_etcd_image': 'quay.io/coreos/etcd:v3.4.13', 'k8s_image_pull_policy': 'IfNotPresent', 'k8s_image_pull_secrets': [], 'k8s_coordinator_cpu': 1.5, 'k8s_coordinator_mem': '2Gi', 'k8s_etcd_num_pods': 1, 'k8s_etcd_cpu': 1.0, 'k8s_etcd_mem': '512Mi', 'k8s_vineyard_daemonset': 'none', 'k8s_vineyard_cpu': 0.2, 'k8s_vineyard_mem': '512Mi', 'vineyard_shared_mem': '4Gi', 'k8s_engine_cpu': 8, 'k8s_engine_mem': '16Gi', 'k8s_mars_worker_cpu': 0.2, 'k8s_mars_worker_mem': '512Mi', 'k8s_mars_scheduler_cpu': 0.2, 'k8s_mars_scheduler_mem': '512Mi', 'with_mars': False, 'enable_gaia': False, 'reconnect': False, 'k8s_volumes': {'data': {'type': 'hostPath', 'field': {'path': '/root/gs_test_data', 'type': 'Directory'}, 'mounts': {'mountPath': '/testingdata/'}}}, 'k8s_waiting_for_delete': False, 'timeout_seconds': 600, 'dangling_timeout_seconds': 600, 'k8s_client_config': {}}
2021-11-08 21:22:08,881 [INFO][cluster:308]: Launching coordinator...
2021-11-08 21:22:11,914 [INFO][utils:167]: coordinator-fzcohs-f6977d9bc-qk24b: Successfully assigned gs-lbfimk/coordinator-fzcohs-f6977d9bc-qk24b to kind-control-plane
2021-11-08 21:22:11,914 [INFO][utils:167]: coordinator-fzcohs-f6977d9bc-qk24b: Container image "registry.cn-hongkong.aliyuncs.com/graphscope/graphscope:0.7.0" already present on machine
2021-11-08 21:22:11,914 [INFO][utils:167]: coordinator-fzcohs-f6977d9bc-qk24b: Created container coordinator
2021-11-08 21:22:11,914 [INFO][utils:167]: coordinator-fzcohs-f6977d9bc-qk24b: Started container coordinator
2021-11-08 21:22:11,765 [INFO][cluster:614]: Launching etcd ...
2021-11-08 21:22:12,787 [INFO][cluster:817]: Etcd is ready, endpoint is 10.96.199.204:58144
2021-11-08 21:22:12,787 [INFO][cluster:820]: Creating interactive engine service...
2021-11-08 21:22:12,787 [INFO][cluster:766]: Launching zetcd proxy service ...
2021-11-08 21:22:12,788 [INFO][cluster:781]: zetcd cmd /usr/local/bin/zetcd --zkaddr 0.0.0.0:2181 --endpoints http://gs-etcd-service-fzcohs:58144,http://gs-etcd-fzcohs-0:58144
Running zetcd proxy
Version: Version not provided (use make instead of go build)
SHA: SHA not provided (use make instead of go build)
2021-11-08 21:22:13,795 [INFO][cluster:810]: ZEtcd is ready, endpoint is 10.244.0.35:2181
2021-11-08 21:22:13,795 [INFO][cluster:828]: Creating engine replicaset...
2021-11-08 21:22:13,795 [INFO][cluster:505]: Launching GraphScope engines pod ...
2021-11-08 21:22:13,941 [INFO][cluster:879]: [gs-engine-fzcohs-bmmkf]: Successfully assigned gs-lbfimk/gs-engine-fzcohs-bmmkf to kind-control-plane
2021-11-08 21:22:16,993 [INFO][cluster:879]: [gs-engine-fzcohs-bmmkf]: Container image "registry.cn-hongkong.aliyuncs.com/graphscope/graphscope:0.7.0" already present on machine
2021-11-08 21:22:16,994 [INFO][cluster:879]: [gs-engine-fzcohs-bmmkf]: Created container engine
2021-11-08 21:22:16,995 [INFO][cluster:879]: [gs-engine-fzcohs-bmmkf]: Started container engine
2021-11-08 21:22:16,996 [INFO][cluster:879]: [gs-engine-fzcohs-bmmkf]: Created container vineyard
2021-11-08 21:22:16,997 [INFO][cluster:879]: [gs-engine-fzcohs-bmmkf]: Started container vineyard
2021-11-08 21:22:22,982 [INFO][utils:167]: coordinator-fzcohs-f6977d9bc-qk24b: Readiness probe failed: dial tcp 10.244.0.35:59194: connect: connection refused
2021-11-08 21:22:26,280 [INFO][cluster:915]: GraphScope engines pod is ready.
2021-11-08 21:22:26,282 [INFO][cluster:1052]: Engines pod name list: ['gs-engine-fzcohs-bmmkf']
2021-11-08 21:22:26,282 [INFO][cluster:1053]: Engines pod ip list: ['10.244.0.37']
2021-11-08 21:22:26,282 [INFO][cluster:1054]: Engines pod host ip list: ['172.19.0.2']
2021-11-08 21:22:26,282 [INFO][cluster:1056]: Vineyard service endpoint: 172.19.0.2:30747
2021-11-08 21:22:26,282 [INFO][cluster:941]: Starting GAE rpc service on 10.244.0.37:56691 ...
2021-11-08 21:22:26,522 [INFO][coordinator:1395]: Coordinator server listen at 0.0.0.0:59194
10.244.0.37 gs-engine-fzcohs-bmmkf
I1108 21:22:27.000000 39 /home/graphscope/gs/analytical_engine/core/grape_instance.cc:44] Workers of grape-engine initialized.
I1108 21:22:27.000000 42 /home/graphscope/gs/analytical_engine/core/server/analytical_server.cc:36] Analytical server is listening on 0.0.0.0:56691
2021-11-08 21:22:34,071 [INFO][cluster:556]: Coordinator pod start successful with address 172.19.0.2:31273, connecting to service ...
2021-11-08 21:22:54,076 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:23:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:23:44,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:24:04,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:24:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:24:54,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:25:24,077 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:25:54,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:26:24,077 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:26:44,077 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:27:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:27:44,076 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:28:04,076 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:28:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:28:44,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:29:04,076 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:29:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:29:44,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:30:04,076 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:30:24,077 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:30:54,076 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:31:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:31:44,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:32:04,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:32:24,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
2021-11-08 21:32:44,075 [WARNING][rpc:127]: Heart beat coordinator failed, code: UNAVAILABLE, details: failed to connect to all addresses
The script used to work fine but suddenly started producing this error today. I tried reinstalling the environment, but it still does not work. Any idea what causes this error?
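One possible starting point, not a confirmed diagnosis: the log reports the coordinator at 172.19.0.2:31273 (a NodePort on the kind-control-plane node), and every heartbeat after that fails with "failed to connect to all addresses". A quick way to narrow this down is to check whether that address accepts a TCP connection from the machine running the client. The sketch below uses only the Python standard library; the helper names `parse_coordinator_address` and `can_connect` are made up for illustration and are not part of GraphScope.

```python
import re
import socket

# Matches the "Coordinator pod start successful with address HOST:PORT" log line.
_ADDR_RE = re.compile(
    r"Coordinator pod start successful with address (\d+\.\d+\.\d+\.\d+):(\d+)"
)


def parse_coordinator_address(log_line):
    """Return (host, port) parsed from the coordinator log line, or None."""
    match = _ADDR_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

Calling `can_connect("172.19.0.2", 31273)` while the session is still trying to connect shows whether the port is reachable at all. If it returns False, the problem is more likely in the network path between the client and the kind node than in GraphScope itself. If it returns True, the coordinator pod's own logs are the next place to look.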