"malformed node or string" on host-definer with topology-awareness enabled #662

Open
kolovo opened this issue Mar 15, 2023 · 3 comments

@kolovo

kolovo commented Mar 15, 2023

Hi,
The host-definer fails to read a secret that uses topology awareness, and as a result host ports cannot be dynamically configured on the storage array.

The Secret is created according to official documentation:
https://www.ibm.com/docs/en/stg-block-csi-driver/1.11.0?topic=topology-creating-secret-awareness

Error message on host-definer:
2023-03-15 14:53:26,796 INFO [139832879761152] [Thread-5] (manager.py:_get_secret_data:293) - Reading secret ibm-nvme-topo-secret in namespace ibm-csi
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib64/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 29, in watch_storage_class_resources
    secrets_info = self._get_secrets_info_from_storage_class_with_driver_provisioner(storage_class_info)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 38, in _get_secrets_info_from_storage_class_with_driver_provisioner
    return self._get_secrets_info_from_storage_class(storage_class_info)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 49, in _get_secrets_info_from_storage_class
    secret_data = self._get_secret_data(secret_name, secret_namespace)
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 295, in _get_secret_data
    return self._change_decode_base64_secret_config(secret_data)
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 305, in _change_decode_base64_secret_config
    secret_data[settings.SECRET_CONFIG_FIELD] = self._decode_base64_to_dict(
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 313, in _decode_base64_to_dict
    my_dict_again = ast.literal_eval(base64.b64decode(base64_dict))
  File "/usr/lib64/python3.8/ast.py", line 99, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib64/python3.8/ast.py", line 98, in _convert
    return _convert_signed_num(node)
  File "/usr/lib64/python3.8/ast.py", line 75, in _convert_signed_num
    return _convert_num(node)
  File "/usr/lib64/python3.8/ast.py", line 66, in _convert_num
    _raise_malformed_node(node)
  File "/usr/lib64/python3.8/ast.py", line 63, in _raise_malformed_node
    raise ValueError(f'malformed node or string: {node!r}')
ValueError: malformed node or string: b' {\n "dev-management-id-2": {\n "username": "demo",\n "password": "demo",\n "management_address": "192.168.1.11",\n "supported_topologies": [\n {\n "topology.block.csi.ibm.com/dc-region": "demo",\n "topology.block.csi.ibm.com/dc-zone": "demo-1"\n }\n ]\n },\n "dev-management-id-1": {\n "username": "demo",\n "password": "demo",\n "management_address": "192.168.1.10",\n "supported_topologies": [\n {\n "topology.block.csi.ibm.com/dc-region": "demo2",\n "topology.block.csi.ibm.com/dc-zone": "demo2-1"\n }\n ]\n }\n }\n'

Error message on scheduled pod :
Normal Scheduled 79s default-scheduler Successfully assigned default/task-pv-pod to mighty-ewe
Warning FailedAttachVolume 10s (x8 over 78s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-4a11b9f7-c064-44dd-940b-c942f3d75dfd" : rpc error: code = NotFound desc = Host for node: Initiators(nvme_nqns=['nqn.2222-11.org.nvmexpress:uuid:82a355e4-f2ee-adsd-asdd-2385ec84ec2b'], fc_wwns=['101110234bdf67b8', '100000109bdfbc24'], iscsi_iqns=[]) was not found, ensure all host ports are configured on storage

From the error message it's clear that the ast.literal_eval function receives a bytes object as its argument (the output of base64.b64decode) rather than a string. According to the documentation, the function works only with strings or AST node expressions.
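
The behaviour described above can be reproduced outside the driver. The following is a minimal sketch, not the driver's code: the sample config, `literal_eval_error`, and `decode_config` are hypothetical names introduced for illustration, and parsing with `json.loads` is one possible approach under the assumption that the secret config is valid JSON, not a confirmed fix.

```python
import ast
import base64
import json

# Hypothetical sample config, shaped like the one in the error message.
config = {
    "demo-id": {
        "username": "demo",
        "supported_topologies": [
            {"topology.block.csi.ibm.com/dc-zone": "demo-1"}
        ],
    }
}
encoded = base64.b64encode(json.dumps(config).encode("utf-8"))

# base64.b64decode returns bytes, not str.
raw = base64.b64decode(encoded)


def literal_eval_error(value):
    """Return the ValueError message raised by ast.literal_eval, or None."""
    try:
        ast.literal_eval(value)
    except ValueError as error:
        return str(error)
    return None


def decode_config(value: bytes) -> dict:
    """Decode the bytes to text first, then parse the text as JSON."""
    return json.loads(value.decode("utf-8"))


print(literal_eval_error(raw))   # message starts with "malformed node or string"
print(decode_config(raw) == config)
```

Passing `raw.decode("utf-8")` to `ast.literal_eval` also avoids this particular error, but only while the config contains no JSON-only tokens such as `true`, `false`, or `null`.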

The version in use is 1.10.0.

Thank you

@kolovo kolovo changed the title "malformed node or string" host-definer whith topology-awareness enabled "malformed node or string" on host-definer whith topology-awareness enabled Mar 15, 2023
@kasserater
Member

Does this still happen with the latest version, 1.11.3?
We were unable to reproduce this issue in our lab.

@kasserater kasserater changed the title "malformed node or string" on host-definer whith topology-awareness enabled "malformed node or string" on host-definer with topology-awareness enabled Oct 7, 2024
@hermionito

Hello

I use topology awareness in an OpenShift environment and encounter the same issue. I am currently using v1.12.0.

I followed this documentation: https://www.ibm.com/docs/en/stg-block-csi-driver/1.12.0?topic=topology-creating-secret-awareness

INFO    [139635005732672] [MainThread] (manager.py:_get_secret_data:293) - Reading secret ibm-block-csi in namespace ibm-block-csi
Traceback (most recent call last):
  File "/driver/controllers/servers/host_definer/main.py", line 10, in <module>
    main()
  File "/driver/controllers/servers/host_definer/main.py", line 6, in main
    host_definition_manager.start_host_definition()
  File "/driver/controllers/servers/host_definer/host_definer_manager.py", line 26, in start_host_definition
    self.storage_class_watcher.add_initial_storage_classes()
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 18, in add_initial_storage_classes
    secrets_info = self._get_secrets_info_from_storage_class_with_driver_provisioner(storage_class_info)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 38, in _get_secrets_info_from_storage_class_with_driver_provisioner
    return self._get_secrets_info_from_storage_class(storage_class_info)
  File "/driver/controllers/servers/host_definer/watcher/storage_class_watcher.py", line 49, in _get_secrets_info_from_storage_class
    secret_data = self._get_secret_data(secret_name, secret_namespace)
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 295, in _get_secret_data
    return self._change_decode_base64_secret_config(secret_data)
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 305, in _change_decode_base64_secret_config
    secret_data[settings.SECRET_CONFIG_FIELD] = self._decode_base64_to_dict(
  File "/driver/controllers/servers/host_definer/kubernetes_manager/manager.py", line 313, in _decode_base64_to_dict
    my_dict_again = ast.literal_eval(base64.b64decode(base64_dict))
  File "/usr/lib64/python3.9/ast.py", line 105, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib64/python3.9/ast.py", line 104, in _convert
    return _convert_signed_num(node)
  File "/usr/lib64/python3.9/ast.py", line 78, in _convert_signed_num
    return _convert_num(node)
  File "/usr/lib64/python3.9/ast.py", line 69, in _convert_num
    _raise_malformed_node(node)
  File "/usr/lib64/python3.9/ast.py", line 66, in _raise_malformed_node
    raise ValueError(f'malformed node or string: {node!r}')
ValueError: malformed node or string: b'{"id-y": {"username": "xxxxx", "password": "xxxxx", "management_address": "xx.xx.xx.xx", "supported_topologies": [{"topology.block.csi.ibm.com/region": "xx", "topology.block.csi.ibm.com/zone": "xx-y"}]}, "id-x": {"username": "xxxx", "password": "xxxxx", "management_address": "xx.xx.xx.xx", "supported_topologies": [{"topology.block.csi.ibm.com/region": "xx", "topology.block.csi.ibm.com/zone": "xx-x"}]}}'

The only way I have found to use the host-definer is to define non-topology secrets and a simple StorageClass for each site, in order to register the host on our SVC system.
After that, I can deploy and use the topology-aware secret and StorageClass.

Am I perhaps missing something in the process?

Thank you

@lechapitre
Contributor

@hermionito

Can you please attach here:

  • The CLI that you use to create the secret
  • The content of the secret file (the one used in the CLI)
  • The output of the secret as stored by the cluster: kubectl get secret --output=json
    (you might need to replace kubectl with oc)

Of course, any secret-file that reproduces the issue should be ok (you don't have to include the actual labels or username/passwords that you use)
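
To compare the secret file with what the cluster actually stores, the JSON output can be decoded locally. This is a rough sketch under stated assumptions: the input is a single Secret object (the secret name was passed to `kubectl get secret`), the values under `data` are base64-encoded as in any Kubernetes Secret, and the field is named `config`. `decode_secret_data` and the sample secret are hypothetical and for illustration only.

```python
import base64
import json


def decode_secret_data(secret_json: str) -> dict:
    """Return the Secret's data fields, base64-decoded to text."""
    secret = json.loads(secret_json)
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


# Hypothetical sample standing in for the kubectl JSON output.
sample_config = '{"demo-id": {"username": "demo", "password": "demo"}}'
sample_secret = json.dumps({
    "kind": "Secret",
    "metadata": {"name": "demo-secret", "namespace": "demo"},
    "data": {
        # Assumed field name; use whatever key the real secret contains.
        "config": base64.b64encode(sample_config.encode("utf-8")).decode("ascii"),
    },
})

decoded = decode_secret_data(sample_secret)
print(decoded["config"])
```

If the decoded text differs from the original secret file (extra whitespace, a second layer of encoding, or different quoting), that difference is worth including in the report.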
