Import Cluster fails on "node X doesn't have network details populated" with Ansible 2.8 #1084
Comments
…d" with Ansible 2.8 bugzilla: 1702412 tendrl-bug-id: Tendrl#1084 Signed-off-by: GowthamShanmugasundaram <[email protected]>
I ran into this same symptom and could not recover from the error, either by downgrading to Ansible 2.7 or by trying to unmanage the cluster. The cluster is stuck in an error state.
Does the /tmp directory have execute permission? If not, please remount /tmp with the exec option: https://askubuntu.com/questions/311438/how-to-make-tmp-executable
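One way to check the /tmp question on each node is to look for the `noexec` option in the mount table. The following is a minimal sketch, assuming a Linux node where the mount table is readable at /proc/mounts; the helper names `mount_options` and `is_noexec` are made up for illustration and are not part of tendrl.

```python
def mount_options(mount_point, mounts_text):
    """Return the option set for mount_point from /proc/mounts-style text.

    Returns None when mount_point is not a separate mount, in which case
    it inherits its options from the parent filesystem.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Format: device mount_point fstype options dump pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point shadows an earlier one.
            options = set(fields[3].split(","))
    return options


def is_noexec(mount_point, mounts_text):
    """True only if mount_point is its own mount and carries noexec."""
    opts = mount_options(mount_point, mounts_text)
    return opts is not None and "noexec" in opts
```

On a node this could be called as `is_noexec("/tmp", open("/proc/mounts").read())`. A result of `False` only means that no `noexec` flag was found on a dedicated /tmp mount; it does not rule out other restrictions.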
Yes, it does. On all nodes.
Please unmanage the cluster and wait until all the nodes are detected by the tendrl server. Start the import once all the nodes are listed with their FQDNs.
The unmanage function is not working either. Nothing happens and I can't check on its progress.
Oh OK, you already mentioned that unmanage is not working, sorry :). Please check the log file /var/log/messages. I think each sync may write the error to the log file.
If nothing works, I will help you solve this problem in a remote call.
I tried to clear everything by deleting my etcd partition, then installed tendrl again, and this is what I got now:

Child jobs failed are [u'02239281-7b28-4cec-8d11-7c9f6b82fca1']
Failed atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster

Failure in Job fdd906f4-4089-4e0b-9680-4f88aea55963 Flow tendrl.flows.ImportCluster with error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py", line 240, in process_job
    the_flow.run()
  File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 131, in run
    exc_traceback)
FlowExecutionFailedError: ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 98, in run\n super(ImportCluster, self).run()\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/__init__.py", line 186, in run\n (atom_fqn, self._defs['help'])\n', 'AtomExecutionFailedError: Atom Execution failed. Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster\n']

Failure in Job 02239281-7b28-4cec-8d11-7c9f6b82fca1 Flow tendrl.flows.ImportCluster with error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tendrl/commons/jobs/__init__.py", line 240, in process_job
    the_flow.run()
  File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 131, in run
    exc_traceback)
FlowExecutionFailedError: ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/__init__.py", line 98, in run\n super(ImportCluster, self).run()\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/__init__.py", line 213, in run\n ret_val = self._execute_atom(atom_fqn)\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/__init__.py", line 252, in _execute_atom\n parameters=self.parameters\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/objects/cluster/atoms/configure_monitoring/__init__.py", line 110, in run\n "interface": self.get_node_interface(NS.node_context.fqdn),\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/objects/cluster/atoms/configure_monitoring/__init__.py", line 80, in get_node_interface\n ip = socket.gethostbyname(fqdn)\n', 'TypeError: must be string, not None\n']
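The last frame of the traceback shows `socket.gethostbyname` being called with `None`, which is what produces the `TypeError`. The sketch below reproduces that failure mode in isolation and shows one way a caller could turn it into a clearer message. It is an illustration only, not the fix that went into tendrl; `resolve_fqdn` is a hypothetical helper name.

```python
import socket


def resolve_fqdn(fqdn):
    """Resolve fqdn to an IPv4 address, rejecting an unpopulated value early.

    socket.gethostbyname(None) raises a bare TypeError; checking first lets
    the error message point at the real cause (missing node details).
    """
    if not fqdn:
        raise ValueError(
            "node FQDN is not populated; node details may not have synced yet"
        )
    return socket.gethostbyname(fqdn)
```

With a guard like this, an unsynced node would be reported as a missing FQDN instead of surfacing as a type error deep inside the monitoring setup.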
It clearly says the FQDN of the node is not populated; there is some problem with the node_context details sync. What are the versions of the tendrl RPMs and Ansible?
Node-01: ansible-2.5.3-1.el7.noarch
Node-02: centos-release-ansible26-1-3.el7.centos.noarch
Node-03: ansible-2.8.0-2.el7.noarch
node-remote: tendrl-ansible-1.6.3-2.el7.centos.noarch
Note that running versions may differ as every node has ansible 2.7.0 for example: Another symptom:
Ah! I found the problem: the upstream release does not yet include the Ansible 2.8 fix; it is still only in the master repo. Here, except node1, all the nodes have Ansible version 2.8; it should be less than 2.8 and greater than 2.5 (including on the tendrl server). After downgrading, restart the tendrl-node-agent service on the nodes as well as on the server. Note:
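The version range mentioned above can be checked against the RPM names reported per node. This is a rough sketch under two assumptions: that the 2.5.x series counts as acceptable (node1 runs 2.5.3 and was described as fine), and that only packages whose name starts with `ansible-` carry the Ansible version. The function names are hypothetical.

```python
import re


def ansible_major_minor(rpm_name):
    """Extract (major, minor) from a name like 'ansible-2.5.3-1.el7.noarch'.

    Returns None for packages that are not the ansible RPM itself, such as
    'tendrl-ansible-...' or 'centos-release-ansible26-...'.
    """
    match = re.match(r"ansible-(\d+)\.(\d+)", rpm_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(version):
    """Assumed range: at least 2.5 and strictly below 2.8."""
    return version is not None and (2, 5) <= version < (2, 8)


for name in ("ansible-2.5.3-1.el7.noarch", "ansible-2.8.0-2.el7.noarch"):
    print(name, is_supported(ansible_major_minor(name)))
```

A node whose package list yields `None` here needs a manual check, since the installed Ansible version cannot be read from that package name.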
With Beta 1 build of Ansible 2.8, it's not possible to import Gluster Trusted
Storage pool into Tendrl, as Cluster Import task fails with error:
Node doesn't have network details populated