Amazon Linux 2023: Invalid response from docker daemon: key \"ApiVersion\" is missing."} #1033

Open
tiendungitd opened this issue Jan 27, 2025 · 11 comments
Labels
docker-plain plain Docker (no swarm, no compose, no stack) third-party-bug This is a bug in some other software, not in ours

Comments

@tiendungitd

SUMMARY

The docker_image and docker_container modules do not work in Ansible 2.15.13. I get the error "Error connecting: Invalid response from docker daemon: key \"ApiVersion\" is missing."

ISSUE TYPE
  • Bug Report
COMPONENT NAME

docker_image
docker_container

ANSIBLE VERSION
ansible [core 2.15.3]
  config file = None
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.9/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.9.20 (main, Dec 11 2024, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] (/usr/bin/python3.9)
  jinja version = 3.1.4
  libyaml = True

COLLECTION VERSION
# /usr/lib/python3.9/site-packages/ansible_collections
Collection       Version
---------------- -------
community.docker 3.4.8

CONFIGURATION
CONFIG_FILE() = None
OS / ENVIRONMENT
NAME="Amazon Linux"
VERSION="2023"
ID="amzn"
ID_LIKE="fedora"
VERSION_ID="2023"
PLATFORM_ID="platform:al2023"
PRETTY_NAME="Amazon Linux 2023.6.20250115"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2023"
HOME_URL="https://aws.amazon.com/linux/amazon-linux-2023/"
DOCUMENTATION_URL="https://docs.aws.amazon.com/linux/"
SUPPORT_URL="https://aws.amazon.com/premiumsupport/"
BUG_REPORT_URL="https://github.com/amazonlinux/amazon-linux-2023"
VENDOR_NAME="AWS"
VENDOR_URL="https://aws.amazon.com/"
SUPPORT_END="2028-03-15"

Docker version

Client:
 Version:           25.0.5
 API version:       1.44
 Go version:        go1.22.5
 Git commit:        5dc9bcc
 Built:             Wed Aug 21 00:00:00 2024
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          25.0.6
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.22.5
  Git commit:       b08a51f
  Built:            Wed Aug 21 00:00:00 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.23
  GitCommit:        57f17b0a6295a39009d861b89e3b3b87b005ca27
 runc:
  Version:          1.1.14
  GitCommit:        2c9f5602f0ba3d9da1c2596322dfc4e156844890
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

pip list|grep docker

docker 2.6.1
docker-pycreds 0.4.0
STEPS TO REPRODUCE
- hosts: localhost
  tasks:
   - name: pull an image
     docker_image:
       name: nginx
       source: pull
EXPECTED RESULTS

Can pull the docker image

ACTUAL RESULTS

TASK [pull an image] *****************************************************************************************************
task path: /home/ec2-user/play.yml:3
redirecting (type: modules) ansible.builtin.docker_image to community.docker.docker_image
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951 `" && echo ansible-tmp-1737712652.1796768-27374-197009959574951="` echo /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951 `" ) && sleep 0'
redirecting (type: modules) ansible.builtin.docker_image to community.docker.docker_image
Using module file /usr/lib/python3.9/site-packages/ansible_collections/community/docker/plugins/modules/docker_image.py
<127.0.0.1> PUT /root/.ansible/tmp/ansible-local-273279cuifomg/tmp38dssglk TO /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951/AnsiballZ_docker_image.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951/ /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951/AnsiballZ_docker_image.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3.9 /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951/AnsiballZ_docker_image.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1737712652.1796768-27374-197009959574951/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
  File "/tmp/ansible_docker_image_payload_ays7ct8s/ansible_docker_image_payload.zip/ansible_collections/community/docker/plugins/module_utils/common_api.py", line 117, in __init__
    super(AnsibleDockerClientBase, self).__init__(**self._connect_params)
  File "/tmp/ansible_docker_image_payload_ays7ct8s/ansible_docker_image_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/tmp/ansible_docker_image_payload_ays7ct8s/ansible_docker_image_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 207, in _retrieve_server_version
    raise DockerException(
fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_version": "auto",
            "archive_path": null,
            "build": null,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "debug": false,
            "docker_host": "unix://var/run/docker.sock",
            "force_absent": false,
            "force_source": false,
            "force_tag": false,
            "load_path": null,
            "name": "nginx",
            "pull": null,
            "push": false,
            "repository": null,
            "source": "pull",
            "ssl_version": null,
            "state": "present",
            "tag": "latest",
            "timeout": 60,
            "tls": false,
            "tls_hostname": null,
            "use_ssh_client": false,
            "validate_certs": false
        }
    },
    "msg": "Error connecting: Invalid response from docker daemon: key \"ApiVersion\" is missing."
}

PLAY RECAP ***************************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0


@felixfontein
Collaborator

This collection is for Docker, not for Podman. You seem to be using it with Podman. That might work in some cases, but not in others.

Check that the Unix socket (/var/run/docker.sock) points to Docker and not to Podman. If you need podman to provide that socket, you need to provide the modules with the location of the Docker socket (docker_host option, can be set with the docker module defaults group for all modules).

(Also, please note that community.docker.docker_image_pull is the better choice for pulling an image.)

@felixfontein felixfontein added question Further information is requested docker-plain plain Docker (no swarm, no compose, no stack) labels Jan 27, 2025
@tiendungitd
Author

tiendungitd commented Jan 27, 2025 via email

@felixfontein
Collaborator

I don't use Amazon's cloud so I cannot easily check.

What does ls -lah /var/run/docker.sock and docker context ls --format json print?

@tiendungitd
Author

tiendungitd commented Jan 28, 2025

Here is the output of both commands.

ls -lah /var/run/docker.sock

srw-rw----. 1 root docker 0 Jan 28 07:12 /var/run/docker.sock

docker context ls --format json

{"Current":true,"Description":"Current DOCKER_HOST based configuration","DockerEndpoint":"unix:///var/run/docker.sock","Error":"","Name":"default"}

@jacekbilski

jacekbilski commented Jan 28, 2025

I have had the same issue for a few days. It was working a few weeks ago.

Ansible version
ansible [core 2.16.3]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.12/site-packages/ansible
  ansible collection location = /home/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.12.8 (main, Dec 12 2024, 09:41:14) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] (/usr/bin/python3.12)
  jinja version = 3.1.2
  libyaml = True
Collection version
# /usr/lib/python3.12/site-packages/ansible_collections
Collection       Version
---------------- -------
community.docker 3.7.0  
Docker version (on target machine, not control node)
Client: Docker Engine - Community
 Version:           27.5.0
 API version:       1.47
 Go version:        go1.22.10
 Git commit:        a187fa5
 Built:             Mon Jan 13 15:27:02 2025
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.5.0
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.22.10
  Git commit:       38b84dc
  Built:            Mon Jan 13 15:25:55 2025
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.25
  GitCommit:        bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc:
  Version:          1.2.4
  GitCommit:        v1.2.4-0-g6c52b3f
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
Steps to reproduce
- name: Give Docker access to Registry
  community.docker.docker_login:
    registry_url: "[REDACTED]"
    username: [REDACTED]
    password: "[REDACTED]"

When I call curl -s --unix-socket /var/run/docker.sock http://localhost/version on one of the target machines where this used to work, I get:
{"Platform":{"Name":"Docker Engine - Community"},"Components":[{"Name":"Engine","Version":"27.5.0","Details":{"ApiVersion":"1.47","Arch":"amd64","BuildTime":"2025-01-13T15:25:55.000000000+00:00","Experimental":"false","GitCommit":"38b84dc","GoVersion":"go1.22.10","KernelVersion":"4.18.0-553.34.1.el8_10.x86_64","MinAPIVersion":"1.24","Os":"linux"}},{"Name":"containerd","Version":"1.7.25","Details":{"GitCommit":"bcc810d6b9066471b0b6fa75f557a15a1cbf31bb"}},{"Name":"runc","Version":"1.2.4","Details":{"GitCommit":"v1.2.4-0-g6c52b3f"}},{"Name":"docker-init","Version":"0.19.0","Details":{"GitCommit":"de40ad0"}}],"Version":"27.5.0","ApiVersion":"1.47","MinAPIVersion":"1.24","GitCommit":"38b84dc","GoVersion":"go1.22.10","Os":"linux","Arch":"amd64","KernelVersion":"4.18.0-553.34.1.el8_10.x86_64","BuildTime":"2025-01-13T15:25:55.000000000+00:00"}
And ApiVersion is there.
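
The same check can be repeated from Python without going through requests or urllib3, since the standard library is enough to talk to the socket. The following is a minimal sketch and not part of the collection: UnixHTTPConnection, api_version_from_body, and daemon_api_version are hypothetical names, and the socket path is the one used in this thread.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a Unix domain socket (hypothetical helper, stdlib only)."""

    def __init__(self, socket_path, timeout=10):
        # The host name is only used for the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the TCP connect with a Unix socket connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def api_version_from_body(body):
    # Returns None if the daemon really did omit the key.
    return json.loads(body).get("ApiVersion")


def daemon_api_version(socket_path="/var/run/docker.sock"):
    # Equivalent of: curl -s --unix-socket <path> http://localhost/version
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/version")
        return api_version_from_body(conn.getresponse().read())
    finally:
        conn.close()
```

If daemon_api_version() returns a version while the modules still fail, the daemon response is fine and the problem is more likely in the Python HTTP stack on that host.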

@tiendungitd
Author

I think the issue has been fixed in the 2.16.3 version

@jacekbilski

I'm using 2.16.3, and it's not working.

@felixfontein
Collaborator

You could try to insert some debug statements to try to figure out what is going wrong. https://pypi.org/project/q/ is quite useful for that.

Also please note that your community.docker versions are pretty old. The latest 3.x.y version is 3.13.5, while you seem to have 3.4.8 resp. 3.7.0. (The current version is 4.3.1.)

@jafd

jafd commented Jan 31, 2025

This error message is misleading. The code where the exception is raised wrongly assumes that, in this code fragment,

    def _retrieve_server_version(self):
        try:
            resp = self.version(api_version=False)["ApiVersion"]
        except KeyError:
            raise DockerException(
                'Invalid response from docker daemon: key "ApiVersion"'
                ' is missing.'
            )
        except Exception as e:
            raise DockerException(
                'Error while fetching server API version: {0}'.format(e)
            )

the only KeyError that can ever happen is the one when we're trying to get ApiVersion. In fact, it's a different KeyError that comes from urllib3!

The full traceback is:
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/common_api.py", line 117, in __init__
    super(AnsibleDockerClientBase, self).__init__(**self._connect_params)
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 188, in __init__
    self._version = self._retrieve_server_version()
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 208, in _retrieve_server_version
    raise e
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 205, in _retrieve_server_version
    resp = self.version(api_version=False)
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/daemon.py", line 196, in version
    return self._result(self._get(url), json=True)
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/utils/decorators.py", line 58, in inner
    return f(self, *args, **kwargs)
  File "/tmp/ansible_community.docker.docker_container_payload_w4whongr/ansible_community.docker.docker_container_payload.zip/ansible_collections/community/docker/plugins/module_utils/_api/api/client.py", line 230, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 557, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 544, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 657, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 467, in send
    conn = self._get_connection(request, verify, proxies)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 338, in _get_connection
    conn = self.poolmanager.connection_from_host(
  File "/usr/lib/python3.9/site-packages/urllib3/poolmanager.py", line 240, in connection_from_host
    return self.connection_from_context(request_context)
  File "/usr/lib/python3.9/site-packages/urllib3/poolmanager.py", line 250, in connection_from_context
    pool_key_constructor = self.key_fn_by_scheme[scheme]

(I got this by reraising the exception.)
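
The masking is easy to reproduce without Docker. The sketch below is a hypothetical stand-in, not the collection's code: DockerException, broken_transport, and both retrieve_* functions are names invented for illustration. broken_transport imitates the failing key_fn_by_scheme[scheme] lookup from the traceback, and retrieve_narrow is one assumed way to keep the two kinds of KeyError apart.

```python
class DockerException(Exception):
    """Hypothetical stand-in for the collection's DockerException."""


def broken_transport():
    # Imitates the urllib3 failure: a dict lookup by URL scheme that
    # raises KeyError before any response from the daemon exists.
    key_fn_by_scheme = {"http": None, "https": None}
    return key_fn_by_scheme["http+docker"]


def retrieve_broad(fetch):
    # Same shape as the quoted fragment: the call and the key access
    # share one try block, so any KeyError is reported as a missing key.
    try:
        return fetch()["ApiVersion"]
    except KeyError:
        raise DockerException(
            'Invalid response from docker daemon: key "ApiVersion" is missing.'
        )


def retrieve_narrow(fetch):
    # Only the key access is guarded, so a KeyError raised inside
    # fetch() propagates unchanged with its original traceback.
    resp = fetch()
    try:
        return resp["ApiVersion"]
    except KeyError:
        raise DockerException(
            'Invalid response from docker daemon: key "ApiVersion" is missing.'
        )
```

With broken_transport as the fetch function, retrieve_broad reports a missing "ApiVersion" key even though no response was ever received, while retrieve_narrow lets the original KeyError through.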

This is exactly the issue #860 that #861 fixes. I can tell that this is the fix by how it starts working after I apply it on my instance.

However, if you inspect which version of requests is installed on the instance, you're in for a surprise:

python3-requests-2.25.1-1.amzn2023.0.4.noarch

Nowhere near 2.32! But wait. If you look at the same file that this commit changes, you'll see familiar code:

    def _get_connection(self, request, verify, proxies=None):
        # Replace the existing get_connection without breaking things and
        # ensure that TLS settings are considered when we interact with
        # urllib3 HTTP Pools
....

So, some Amazon Linux 2023 guys (or maybe a single overworked guy) backported things in requests without checking if they break anything in dependent packages (and Docker and Ansible are not obscure niche packages nobody is using).
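
Because the packaged version string does not reflect the backport, comparing version numbers is not a reliable signal on such a host; checking for the method itself is. The following is a rough sketch under that assumption. has_get_connection and inspect_installed_requests are hypothetical helper names, and on other requests releases the method may be absent or named differently.

```python
def has_get_connection(adapter_cls):
    # True if the adapter class defines the _get_connection method
    # quoted above, regardless of what the version string claims.
    return callable(getattr(adapter_cls, "_get_connection", None))


def inspect_installed_requests():
    # Imported lazily so the helper above can be used without requests.
    import requests
    from requests.adapters import HTTPAdapter

    # Returns e.g. the reported version plus whether the method exists.
    return requests.__version__, has_get_connection(HTTPAdapter)
```

On the affected instance, inspect_installed_requests() would be expected to report version 2.25.1 together with True, which is the combination described above.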

Given, however, that amazonlinux/amazon-linux-2023#893 even exists (they have pushed a fixed version now), my opinion of their packaging team is very uncharitable right now.

@jafd

jafd commented Jan 31, 2025

I have filed amazonlinux/amazon-linux-2023#902 because it's really an AL2023 issue, my comrade in suffering.

@felixfontein
Collaborator

@jafd thanks for debugging this! I agree that the assumption that the KeyError can only come from that access is wrong, though this seems to be a common pattern in Python. (Also, this code got vendored from Docker SDK for Python: https://github.com/docker/docker-py/blob/db7f8b8bb67e485a7192846906f600a52e0aa623/docker/api/client.py#L221-L232) But even if written differently, str(KeyError('foo')) is simply 'foo', so the error would have been Error while fetching server API version: 'http+docker', which isn't much more helpful ;)
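
A quick standalone check of how a KeyError formats may help here. describe_error is a hypothetical helper, not part of the collection or the Docker SDK for Python; it only illustrates that including the exception type makes the message easier to interpret. Note that str() of a single-argument KeyError is the repr of the key, so the quotes are part of the text.

```python
def describe_error(exc):
    # Prefix the exception type so a bare key such as 'http+docker'
    # is recognisable as a KeyError rather than a daemon message.
    return "{0}: {1}".format(type(exc).__name__, exc)


err = KeyError("http+docker")

# Message in the style of the generic handler quoted in this thread.
plain = "Error while fetching server API version: {0}".format(err)

# Same message with the exception type included.
typed = "Error while fetching server API version: {0}".format(describe_error(err))

print(plain)
print(typed)
```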

@felixfontein felixfontein changed the title Invalid response from docker daemon: key \"ApiVersion\" is missing."} Amazon Linux 2023: Invalid response from docker daemon: key \"ApiVersion\" is missing."} Jan 31, 2025
@felixfontein felixfontein added third-party-bug This is a bug in some other software, not in ours and removed question Further information is requested labels Feb 1, 2025