Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache node images #11202

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 0 additions & 25 deletions roles/download/tasks/check_pull_required.yml

This file was deleted.

5 changes: 0 additions & 5 deletions roles/download/tasks/download_container.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,6 @@
tags:
- facts

- name: Download_container | Prepare container download
include_tasks: check_pull_required.yml
when:
- not download_always_pull

- debug: # noqa name[missing]
msg: "Pull {{ image_reponame }} required is: {{ pull_required }}"

Expand Down
59 changes: 58 additions & 1 deletion roles/download/tasks/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,39 @@
- download
- upload

- name: Download | Check if image command tool exists
command: "{{ image_command_tool }} --version"
register: image_command_tool_exists
changed_when: false
failed_when: false

- name: Download | Check if image command tool exists
stat:
path: "{{ bin_dir }}/{{ image_command_tool }}"
get_attributes: false
get_checksum: false
get_mime: false
register: image_command_tool_exists
changed_when: false

- name: Download | Process and download required files and images
when:
- not (download_always_pull or skip_downloads) | default(false)
- image_command_tool_exists.stat.exists
block:
- name: Download | Get image list from node
command: "{{ image_info_command }}"
register: node_images_raw
changed_when: false
check_mode: false
- name: Download | Set node_images
include_tasks: "{{ include_file }}"
vars:
include_file: "set_node_facts{% if image_command_tool == 'crictl' %}_crictl{% endif %}.yml"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this include gave us much, and it does break reading flow.
Can you instead put the task here and distinguish using when ?

- name: Download | Show node images
debug:
msg: "{{ node_images }}"

- name: Download | Download files / images
include_tasks: "{{ include_file }}"
loop: "{{ downloads | combine(kubeadm_images) | dict2items }}"
Expand All @@ -26,5 +59,29 @@
- not skip_downloads | default(false)
- download.enabled
- item.value.enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would split this in two tasks (download_container / download_file) to simplify the loops and condition, see below

- (not (item.value.container | default(false))) or (item.value.container and download_container)
- >
(
not (
item.value.container | default(false)
)
)
or (
item.value.container and
download_container and
Comment on lines +62 to +70
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to get rid of those conditions by filtering the list before feeding it to the loop instead:

vars:
    base_dls: "{{ downloads | combine(kubeadm_images) | dict2items }}"
    containers_dls: "{{ base_dls | selectattr('value.container', 'defined') | selectattr('value.container') }}"

(
not (
(
node_images | selectattr('Repository', 'equalto', (item.value.repo | regex_replace('^docker\.io/(library/)?', ''))) | selectattr('Tag', 'equalto', item.value.tag) | list | length > 0
)
or (
node_images | selectattr('Digest', 'equalto', 'sha256:' ~ (item.value.sha256 | default(None))) | list | length > 0
)
)
or download_always_pull
)
)
- (download_run_once and inventory_hostname == download_delegate) or (group_names | intersect(download.groups) | length)

- name: Download | Show downloads
debug:
msg: "{{ downloads }}"
7 changes: 7 additions & 0 deletions roles/download/tasks/set_node_facts.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
# In case of nerdctl or docker, the output is already in the expected format
# {"Digest":"sha256:847423221ed040798e47df36450edce5581ed642cd087ccb210532764da38b23","Repository":"quay.io/coreos/etcd","Tag":"v3.5.12"}
# {"Digest":"sha256:34fc87c4a60c0b3ba3b3608871f4494de8072c02808a4151953068f2d7c87743","Repository":"flannel/flannel","Tag":"latest"}
- name: Download | Set node_images (nerdctl, docker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, we only use download.repo/sha256/tag in one place :

$ grep -R 'download\.tag' roles/download/
roles/download/tasks/set_container_facts.yml:      {%- if pull_by_digest %}{{ download.repo }}@sha256:{{ download.sha256 }}{%- else -%}{{ download.repo }}:{{ download.tag }}{%- endif -%}
roles/download/tasks/check_pull_required.yml:    that: "{{ download.repo }}:{{ download.tag }} in docker_images.stdout.split(',')"
```
Instead of separating those, maybe we could instead have `{'by_repo_digest': ..., by_repo_tag : 'flannel/flannel:latest' }`

In particular, with this filtering becomes easiers, because you can use `contains` on a fixed set (obtening from nodes images) while mapping)
(something like `containers_dls | selectattr('sha256', 'contains', (node_images | map(attribute='repo_by_digest'))` (wrong syntax I think can't remember the proper order on  top of my head)

set_fact:
node_images: "{{ node_images_raw.stdout_lines | map('from_json') | list }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use a variable instead of a fact for this.

9 changes: 9 additions & 0 deletions roles/download/tasks/set_node_facts_crictl.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
- name: Download | Set node_images (crictl)
set_fact:
node_images: >-
{{
(node_images | default([])) +
[{'Digest': (item.repoDigests[0] | split('@'))[1], 'Repository': ((item.repoTags[0] | split(':'))[0] | regex_replace('^docker\.io/(library/)?', '')), 'Tag': (item.repoTags[0] | split(':'))[1]}]
}}
with_items: "{{ (node_images_raw.stdout | from_json).images }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can avoid looping over all images using some json_query and filters from community.general:

     node_images: "{{ ids | zip(repos, _tags) | map('zip', ['Digest', 'Repository', 'Tag'])
            | map('map', 'reverse') | map('community.general.dict') }}"
    vars:
      queried: "{{ node_images_raw.stdout | from_json | json_query(query) }}"
      query: "images[*].{hash: @.id, rep: @.repoTags[0]}"
      reps:  "{{ queried | map(attribute='rep') | map('split', ':') }}"
      ids: "{{ queried | map(attribute='hash') }}"
      _tags: "{{ reps | map(attribute=1) }}"
      repos: "{{ reps | map(attribute=0) }}"

(in order to avoid the loop overhead)

18 changes: 18 additions & 0 deletions roles/download/vars/main.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
---
node_images: []

docker_image_pull_command: "{{ docker_bin_dir }}/docker pull"
docker_image_info_command: "{{ docker_bin_dir }}/docker images --format='{\"Digest\":\"{{ '{{' }}.Digest{{ '}}' }}\",\"Repository\":\"{{ '{{' }}.Repository{{ '}}' }}\",\"Tag\":\"{{ '{{' }}.Tag{{ '}}' }}\"}'"
nerdctl_image_info_command: "{{ bin_dir }}/nerdctl images --format='{\"Digest\":\"{{ '{{' }}.Digest{{ '}}' }}\",\"Repository\":\"{{ '{{' }}.Repository{{ '}}' }}\",\"Tag\":\"{{ '{{' }}.Tag{{ '}}' }}\"}'"
# Using the ctr instead of nerdctl to workdaround the https://github.com/kubernetes-sigs/kubespray/issues/10670
nerdctl_image_pull_command: "{{ bin_dir }}/ctr -n k8s.io images pull{% if containerd_registries_mirrors is defined %} --hosts-dir {{ containerd_cfg_dir }}/certs.d{%- endif -%}"
crictl_image_info_command: "{{ bin_dir }}/crictl images --output json"
crictl_image_pull_command: "{{ bin_dir }}/crictl pull"

image_command_tool: "{%- if container_manager == 'containerd' -%}nerdctl{%- elif container_manager == 'crio' -%}crictl{%- else -%}{{ container_manager }}{%- endif -%}"
image_command_tool_on_localhost: "{{ image_command_tool }}"

image_pull_command: "{{ lookup('vars', image_command_tool + '_image_pull_command') }}"
image_info_command: "{{ lookup('vars', image_command_tool + '_image_info_command') }}"
image_pull_command_on_localhost: "{{ lookup('vars', image_command_tool_on_localhost + '_image_pull_command') }}"
image_info_command_on_localhost: "{{ lookup('vars', image_command_tool_on_localhost + '_image_info_command') }}"
17 changes: 0 additions & 17 deletions roles/kubespray-defaults/defaults/main/download.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,23 +53,6 @@ download_delegate: "{% if download_localhost %}localhost{% else %}{{ groups['kub
# Allow control the times of download retries for files and containers
download_retries: 4

# The docker_image_info_command might seems weird but we are using raw/endraw and `{{ `{{` }}` to manage the double jinja2 processing
docker_image_pull_command: "{{ docker_bin_dir }}/docker pull"
docker_image_info_command: "{{ docker_bin_dir }}/docker images -q | xargs -i {{ '{{' }} docker_bin_dir }}/docker inspect -f {% raw %}'{{ '{{' }} if .RepoTags }}{{ '{{' }} join .RepoTags \",\" }}{{ '{{' }} end }}{{ '{{' }} if .RepoDigests }},{{ '{{' }} join .RepoDigests \",\" }}{{ '{{' }} end }}' {% endraw %} {} | tr '\n' ','"
nerdctl_image_info_command: "{{ bin_dir }}/nerdctl -n k8s.io images --format '{% raw %}{{ .Repository }}:{{ .Tag }}{% endraw %}' 2>/dev/null | grep -v ^:$ | tr '\n' ','"
# Using the ctr instead of nerdctl to workdaround the https://github.com/kubernetes-sigs/kubespray/issues/10670
nerdctl_image_pull_command: "{{ bin_dir }}/ctr -n k8s.io images pull{% if containerd_registries_mirrors is defined %} --hosts-dir {{ containerd_cfg_dir }}/certs.d{%- endif -%}"
crictl_image_info_command: "{{ bin_dir }}/crictl images --verbose | awk -F ': ' '/RepoTags|RepoDigests/ {print $2}' | tr '\n' ','"
crictl_image_pull_command: "{{ bin_dir }}/crictl pull"

image_command_tool: "{%- if container_manager == 'containerd' -%}nerdctl{%- elif container_manager == 'crio' -%}crictl{%- else -%}{{ container_manager }}{%- endif -%}"
image_command_tool_on_localhost: "{{ image_command_tool }}"

image_pull_command: "{{ lookup('vars', image_command_tool + '_image_pull_command') }}"
image_info_command: "{{ lookup('vars', image_command_tool + '_image_info_command') }}"
image_pull_command_on_localhost: "{{ lookup('vars', image_command_tool_on_localhost + '_image_pull_command') }}"
image_info_command_on_localhost: "{{ lookup('vars', image_command_tool_on_localhost + '_image_info_command') }}"

# Arch of Docker images and needed packages
image_arch: "{{ host_architecture | default('amd64') }}"

Expand Down