Import Cluster fails on "node X doesn't have network details populated" with Ansible 2.8 #1084

GowthamShanmugam · 2019-04-24T10:33:19Z

With the Beta 1 build of Ansible 2.8, it's not possible to import a Gluster Trusted
Storage Pool into Tendrl, because the Cluster Import task fails with the error:

Node doesn't have network details populated

…d" with Ansible 2.8 bugzilla: 1702412 tendrl-bug-id: Tendrl#1084 Signed-off-by: GowthamShanmugasundaram <[email protected]>

SalsaBr · 2019-06-17T16:34:59Z

I ran into this same symptom and could not recover from the error, neither by downgrading to Ansible 2.7 nor by trying to unmanage the cluster. The cluster is stuck in an error state.

GowthamShanmugam · 2019-06-18T02:59:07Z

Does the /tmp directory have execute permission? If not, please remount /tmp with the exec option: https://askubuntu.com/questions/311438/how-to-make-tmp-executable
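One way to check this on each node is to look at the mount options for /tmp. The snippet below is only a minimal sketch, not part of Tendrl: the helper names `mount_options` and `is_noexec` are made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options).

```python
def mount_options(mounts_text, mount_point):
    """Return the option list of the last entry for mount_point, or None.

    mounts_text is expected to look like the contents of /proc/mounts:
    "<device> <mount point> <fs type> <options> <dump> <pass>" per line.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Later entries override earlier ones, so keep the last match.
            options = fields[3].split(",")
    return options


def is_noexec(mounts_text, mount_point="/tmp"):
    """True only if mount_point is a separate mount carrying 'noexec'."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noexec" in options
```

On a node, pass it the text read from /proc/mounts. If /tmp is not a separate mount, the function returns False and the options of the parent filesystem would need to be checked instead.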

SalsaBr · 2019-06-18T21:34:43Z

Yes, it does, on all nodes.

GowthamShanmugam · 2019-06-19T05:07:30Z

Please unmanage the cluster and wait until all the nodes are detected by the Tendrl server. Start the import once all the nodes are listed with their FQDNs.

SalsaBr · 2019-06-21T18:07:01Z

The unmanage function is not working either. Nothing happens and I can't check on its progress.

GowthamShanmugam · 2019-06-24T06:06:50Z

Oh, OK, you already mentioned that unmanage is not working, sorry :). Please check the log file /var/log/messages. I think each sync may write the error to the log file.

GowthamShanmugam · 2019-06-24T06:07:21Z

If nothing works, I can help you solve this problem on a remote call.

SalsaBr · 2019-06-24T21:08:51Z

I tried to clear everything by deleting my etcd partition, then installed Tendrl again, and this is what I get now:

Child jobs failed are [u'02239281-7b28-4cec-8d11-7c9f6b82fca1']

Failed atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster

Failure in Job fdd906f4-4089-4e0b-9680-4f88aea55963 Flow tendrl.flows.ImportCluster with error: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/tendrl/commons/jobs/init.py", line 240, in process_job the_flow.run() File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/init.py", line 131, in run exc_traceback) FlowExecutionFailedError: ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/init.py", line 98, in run\n super(ImportCluster, self).run()\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/init.py", line 186, in run\n (atom_fqn, self._defs['help'])\n', 'AtomExecutionFailedError: Atom Execution failed. Error: Error executing atom: tendrl.objects.Cluster.atoms.ImportCluster on flow: Import existing Gluster Cluster\n']

Failure in Job 02239281-7b28-4cec-8d11-7c9f6b82fca1 Flow tendrl.flows.ImportCluster with error: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/tendrl/commons/jobs/init.py", line 240, in process_job the_flow.run() File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/init.py", line 131, in run exc_traceback) FlowExecutionFailedError: ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/import_cluster/init.py", line 98, in run\n super(ImportCluster, self).run()\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/init.py", line 213, in run\n ret_val = self._execute_atom(atom_fqn)\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/flows/init.py", line 252, in _execute_atom\n parameters=self.parameters\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/objects/cluster/atoms/configure_monitoring/init.py", line 110, in run\n "interface": self.get_node_interface(NS.node_context.fqdn),\n', ' File "/usr/lib/python2.7/site-packages/tendrl/commons/objects/cluster/atoms/configure_monitoring/init.py", line 80, in get_node_interface\n ip = socket.gethostbyname(fqdn)\n', 'TypeError: must be string, not None\n']
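The last frame of the traceback above shows socket.gethostbyname being called with None instead of a hostname. The sketch below is not Tendrl's code; `resolve_node_ip` is a hypothetical helper that only illustrates how an explicit check would turn the TypeError into a clearer message when the FQDN has not been populated.

```python
import socket


def resolve_node_ip(fqdn):
    """Resolve fqdn to an IPv4 address, failing early if it is missing.

    Hypothetical helper for illustration; without the check,
    socket.gethostbyname(None) raises a TypeError like the one above.
    """
    if not fqdn:
        raise ValueError("node FQDN is not populated; cannot resolve an IP")
    return socket.gethostbyname(fqdn)
```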

GowthamShanmugam · 2019-06-27T07:59:02Z

It clearly says that the FQDN of the node is not populated, so there is some problem with the node_context details sync. What are the versions of the Tendrl RPMs and Ansible?
rpm -qa | grep tendrl
rpm -qa | grep ansible
Run these commands on the server as well as on the storage nodes.
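If the same check has to be repeated on several nodes, the two commands above could be wrapped in a small script. This is only a sketch under assumptions: `filter_packages` and `installed_packages` are illustrative names, and the script relies on `rpm -qa` being available on the node.

```python
import subprocess


def filter_packages(rpm_output, keywords=("tendrl", "ansible")):
    """Return the sorted package names that contain any of the keywords."""
    return sorted(
        line.strip()
        for line in rpm_output.splitlines()
        if line.strip() and any(key in line for key in keywords)
    )


def installed_packages():
    """Return the raw output of `rpm -qa` as text (needs rpm on the node)."""
    return subprocess.check_output(["rpm", "-qa"]).decode()
```

Calling `filter_packages(installed_packages())` on a node should give roughly the same list as the two grep commands combined.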

SalsaBr · 2019-06-27T19:22:49Z

Node-01:
tendrl-collectd-selinux-1.5.4-2.el7.centos.noarch
tendrl-selinux-1.5.4-2.el7.centos.noarch
tendrl-commons-1.6.3-11.el7.noarch
tendrl-gluster-integration-1.6.3-10.el7.noarch
tendrl-node-agent-1.6.3-9.el7.noarch

ansible-2.5.3-1.el7.noarch

Node-02:
tendrl-collectd-selinux-1.5.4-2.el7.centos.noarch
tendrl-selinux-1.5.4-2.el7.centos.noarch
tendrl-node-agent-1.6.3-9.el7.noarch
tendrl-gluster-integration-1.6.3-10.el7.noarch
tendrl-commons-1.6.3-11.el7.noarch

centos-release-ansible26-1-3.el7.centos.noarch
ansible-2.8.0-2.el7.noarch

Node-03:
tendrl-collectd-selinux-1.5.4-2.el7.centos.noarch
tendrl-gluster-integration-1.6.3-10.el7.noarch
tendrl-selinux-1.5.4-2.el7.centos.noarch
tendrl-node-agent-1.6.3-9.el7.noarch
tendrl-commons-1.6.3-11.el7.noarch

ansible-2.8.0-2.el7.noarch

node-remote:
tendrl-notifier-1.6.3-4.el7.noarch
tendrl-ansible-1.6.3-2.el7.centos.noarch
tendrl-monitoring-integration-1.6.3-11.el7.noarch
tendrl-grafana-selinux-1.5.4-2.el7.centos.noarch
tendrl-collectd-selinux-1.5.4-2.el7.centos.noarch
tendrl-gluster-integration-1.6.3-10.el7.noarch
tendrl-selinux-1.5.4-2.el7.centos.noarch
tendrl-commons-1.6.3-11.el7.noarch
tendrl-api-1.6.3-7.el7.noarch
tendrl-api-httpd-1.6.3-7.el7.noarch
tendrl-node-agent-1.6.3-9.el7.noarch
tendrl-ui-1.6.3-10.el7.noarch
tendrl-grafana-plugins-1.6.3-11.el7.noarch

tendrl-ansible-1.6.3-2.el7.centos.noarch
ansible-2.8.0-2.el7.noarch
centos-release-ansible26-1-3.el7.centos.noarch

Note that the running versions may differ, as every node reports ansible 2.7.0, for example:
ansible --version
ansible 2.7.0
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Apr 9 2019, 14:30:50) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

Another symptom:
Tendrl mentions 4 hosts discovered in the cluster, but when I try to view these hosts I get a table with only 3 hosts. The Tendrl server, which is a geo-rep host for the cluster, is missing. This may be related, as the 3 hosts being shown have correct names and IPs.

GowthamShanmugam · 2019-06-28T04:19:12Z

Ah! I found the problem: the upstream release does not yet include the Ansible 2.8 fix; it is still only in the master repo.

Here, all nodes except Node-01 have Ansible 2.8. The version should be lower than 2.8 and higher than 2.5 (including on the Tendrl server).

After downgrading, restart the tendrl-node-agent service on the nodes as well as on the server.
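To spot nodes outside that range from the package lists posted earlier, a check along these lines could be used. It is a rough sketch, not a Tendrl tool: the function names are made up, and it assumes the range means 2.5.x through 2.7.x, since Node-01 with ansible-2.5.3 was not flagged as a problem.

```python
import re


def ansible_version(package_name):
    """Extract (major, minor) from an RPM name like 'ansible-2.8.0-2.el7.noarch'.

    Returns None for other packages, for example 'tendrl-ansible-...'.
    """
    match = re.match(r"ansible-(\d+)\.(\d+)", package_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(package_name, low=(2, 5), high=(2, 8)):
    """True if the Ansible version satisfies low <= version < high.

    The bounds are an assumption based on the advice in this thread.
    """
    version = ansible_version(package_name)
    return version is not None and low <= version < high
```

With the lists above, `is_supported("ansible-2.5.3-1.el7.noarch")` is True and `is_supported("ansible-2.8.0-2.el7.noarch")` is False.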

Note:
Don't install any Gluster packages on the Tendrl server.

GowthamShanmugam mentioned this issue Apr 24, 2019

Import Cluster fails on "node X doesn't have network details populated" with Ansible 2.8 #1085

Merged
