Attempting to run agent as pod fails #4

erik-hallros · 2024-10-21T20:12:59Z

When I try to run a Kasm agent as a Kubernetes pod, it fails while executing the Python application.

Is the container name hardcoded to kasm_agent in the Python code?

To get this far, I had to change the node's container runtime to Docker and replicate the behaviour of the agent container created by the single-host setup, based on its docker inspect output.

k8s:~$ sudo kubectl get nodes -o wide
NAME   STATUS   ROLES                  AGE    VERSION        INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
k8s    Ready    control-plane,master   2d2h   v1.30.5+k3s1   192.168.68.119   <none>        Ubuntu 24.04.1 LTS   6.8.0-1017-aws   docker://27.3.1

Pod Log

k8s:~$ sudo kubectl logs kasm-agent
Executing /usr/bin/kasm_agent.so
Received config /host/kasm/current/conf/app/agent.app.config.yaml
2024-10-22 19:36:02,549 [INFO] __main__.handler: Starting Server On Port 4444
2024-10-22 19:36:02,550 [DEBUG] __main__.handler: Sending manager request (https://kasm.cluster.local:443/manager_api/api/v1/agent_config)
2024-10-22 19:36:02,557 [DEBUG] __main__.handler: {}
2024-10-22 19:36:02,658 [DEBUG] __main__.handler: No GPU filtering defined by user
2024-10-22 19:36:02,670 [DEBUG] __main__.handler: Rebuilding file Mappings
2024-10-22 19:36:02,672 [DEBUG] __main__.handler: Current file mappings: {}
2024-10-22 19:36:02,674 [DEBUG] __main__.handler: Provisioner initialized with 0 GPU(s)
Traceback (most recent call last):
  File "docker/api/client.py", line 265, in _raise_for_status
  File "requests/models.py", line 1021, in raise_for_status
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.47/containers/kasm_agent/json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "__init__.py", line 933, in <module>
  File "__init__.py", line 832, in start
  File "__init__.py", line 783, in __init__
  File "provision.py", line 143, in __init__
  File "docker/models/containers.py", line 951, in get
  File "docker/utils/decorators.py", line 19, in wrapped
  File "docker/api/container.py", line 792, in inspect_container
  File "docker/api/client.py", line 271, in _result
  File "docker/api/client.py", line 267, in _raise_for_status
  File "docker/errors.py", line 39, in create_api_error_from_http_exception
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.47/containers/kasm_agent/json: Not Found ("No such container: kasm_agent")
[8] Failed to execute script '__init__' due to unhandled exception!

docker ps

k8s:~$ sudo docker ps
CONTAINER ID   IMAGE                                     COMMAND                  CREATED          STATUS          PORTS     NAMES
a55efa0e6270   rancher/mirrored-pause:3.6                "/pause"                 24 minutes ago   Up 24 minutes             k8s_POD_kasm-agent_default_63bb038e-361d-479b-be76-8baca2c739a6_0

k8s:~$ sudo kubectl get pods -o wide
NAME         READY   STATUS             RESTARTS        AGE   IP            NODE   NOMINATED NODE   READINESS GATES
kasm-agent   0/1     CrashLoopBackOff   9 (2m47s ago)   23m   10.42.0.236   k8s    <none>           <none>

The container name is not kasm_agent when the agent runs as a pod with Rancher, and underscores are not allowed in pod names in the YAML file passed to kubectl.
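A minimal sketch of why the two names cannot line up, assuming the RFC 1123 label rule that Kubernetes applies to container names (pod names follow the related DNS subdomain rule, which also excludes underscores) and the `k8s_<container>_<pod>_<namespace>_<uid>_<attempt>` pattern visible in the docker ps output above. The helper names `is_dns_label` and `parse_k8s_docker_name` are hypothetical and not part of Kasm or Kubernetes.

```python
import re

# RFC 1123 label: lowercase alphanumerics and '-', starting and ending with
# an alphanumeric character, at most 63 characters long.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_dns_label(name: str) -> bool:
    """Return True if `name` is a valid RFC 1123 label."""
    return len(name) <= 63 and _DNS_LABEL.fullmatch(name) is not None


def parse_k8s_docker_name(docker_name: str) -> dict:
    """Split a Docker container name created for a pod into its fields.

    Assumes the six-field layout seen in the docker ps output above; none of
    the fields can contain '_' themselves, so a plain split is enough.
    """
    parts = docker_name.split("_")
    if len(parts) != 6 or parts[0] != "k8s":
        raise ValueError(f"not a Kubernetes-managed container name: {docker_name!r}")
    _, container, pod, namespace, uid, attempt = parts
    return {
        "container": container,
        "pod": pod,
        "namespace": namespace,
        "uid": uid,
        "attempt": int(attempt),
    }
```

With these helpers, `is_dns_label("kasm_agent")` is false, so no pod or container spec can produce a Docker container literally named kasm_agent; the name Docker sees always has the `k8s_` form.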

agent.app.config.yaml

agent:
  auto_generate_kasm_docker_networks: false
  default_host_key: 1234
  disk_usage_limit: 0.9
  docker_async_script_timeout: 900
  docker_port_listen_addr: localhost
  docker_script_timeout: 180
  forward_logs_to_manager: true
  heartbeat_interval: 30000
  images_interval: 3600
  log_container_stats: false
  log_heartbeat_data: false
  max_concurrent_docker_pulls: 2
  nginx_container_dir: /etc/nginx/conf.d/containers.d/
  persist_config_interval: 100000
  persist_config_updates: true
  port: 4444
  provider: hardware
  public_hostname: agent1.cluster.local
  public_port: 443
  remove_failed_containers: true
  retention_period: '24'
  server_id: dfdd1472-dcaf-41c8-beda-1b7bf44b9d40
  starting_nginx_port: 5971
  type: host
  validate_images: true
logging:
  agent:
    formatters:
      pythonjsonlogger:
        (): pythonjsonlogger.jsonlogger.JsonFormatter
        fmt: '%(asctime) %(name) %(processName) %(filename)  %(funcName) %(levelname)
          %(lineno) %(module) %(threadName) %(message)'
        timestamp: true
      standard:
        format: '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
    handlers:
      stream:
        class: logging.StreamHandler
        formatter: standard
        level: DEBUG
      syslog:
        class: logging.handlers.SysLogHandler
        formatter: pythonjsonlogger
        level: DEBUG
    loggers:
      ? ''
      : handlers:
        - stream
        - syslog
        level: DEBUG
        propagate: true
      tornado:
        level: INFO
    version: 1
manager:
  client_cert: /srv/provision_agent/client_cert.pem
  config_path: /manager_api/api/v1/agent_config
  heartbeat_path: /manager_api/api/v1/heartbeat
  hostnames:
  - kasm.cluster.local
  - $request_host$
  images_path: /manager_api/api/v1/images
  public_port: 443
  scheme: https
  server_cert: /srv/provision_agent/server_cert.pem
  token: nBa83Xpz1LzSvZ1YhWHBjFMua4

kasmagent.yaml

apiVersion: v1
kind: Pod
metadata:
  name: kasm-agent
  labels:
    app: kasm-agent
spec:
  containers:
  - name: kasm-agent
    image: kasmweb/agent:1.16.0
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh"]
    args: ["-c", "echo 192.168.68.119 kasm.cluster.local >> /etc/hosts && cp /host/certs/*.crt /etc/ssl/certs/ && /usr/bin/startup.sh -s /usr/bin/kasm_agent.so -c /host/kasm/current/conf/app/agent.app.config.yaml"]
    ports:
    - containerPort: 4444
    env:
    - name: SOCKET_PATH
      value: /var/run/docker.sock
    - name: DEFAULT_MANAGER_TOKEN
      value: "nBa83Xpz1LzSvZ1YhWHBjFMua4"
    - name: MANAGER_HOSTNAME
      value: kasm.cluster.local
    #- name: PATH
    #  value: "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    - name: KASM_BUILD_ID
      value: "1.16.0.a1d5b7"
    - name: GAMEPAD_BUILD_ID
      value: "151399e32c159b590a31a3d20b883af7d9104643"
    volumeMounts:
    - name: ssl-cert-volume
      mountPath: /host/certs
    - name: kasm-config
      mountPath: /host/kasm/current
      #subPath: 1.16.0
    - name: nginx-config
      mountPath: /etc/nginx/conf.d
      #subPath: conf.d
    - name: docker-sock
      mountPath: /var/run/docker.sock
    livenessProbe:
      httpGet:
        path: /__healthcheck
        port: 4444
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 3
    readinessProbe:
      httpGet:
        path: /__healthcheck
        port: 4444
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 3
    resources:
      limits:
        memory: "500Mi"
        cpu: "500m"
      requests:
        memory: "200Mi"
        cpu: "200m"
  restartPolicy: Always
  volumes:
  - name: ssl-cert-volume
    hostPath:
      path: /etc/ssl/certs
      type: Directory
  - name: kasm-config
    hostPath:
      path: /opt/kasm/1.16.0
  - name: nginx-config
    hostPath:
      path: /opt/kasm/1.16.0/conf/nginx
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock


jimmy-ungerman · 2024-10-25T19:27:19Z

@erik-hallros Curious whether you ever got this working? I'm trying to deploy an agent inside the same cluster as well.

erik-hallros · 2024-10-25T21:02:52Z

@jimmy-ungerman I had some success getting further by writing a custom middleware.sock that intercepts requests for the kasm_agent path and looks the container up by label instead of by the name kasm_agent, which works.
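The label-based lookup can be sketched roughly as follows. This is not the middleware from the comment above, only an illustration of the idea using the Docker SDK for Python's `containers.list(filters=...)` call. The label `io.kubernetes.container.name=kasm-agent` is an assumption about what a Docker-backed kubelet attaches to the container, and `find_agent_container` is a hypothetical helper.

```python
from typing import Any, Optional

# Assumption: the kubelet labels the container with the name from the pod
# spec. Check the real labels with `docker inspect` before relying on this.
AGENT_LABEL = "io.kubernetes.container.name=kasm-agent"


def find_agent_container(client: Any, label: str = AGENT_LABEL) -> Optional[Any]:
    """Return the first running container carrying `label`, or None.

    `client` is expected to behave like a Docker SDK client, for example
    the object returned by docker.from_env().
    """
    matches = client.containers.list(filters={"label": label})
    return matches[0] if matches else None
```

Unlike `client.containers.get("kasm_agent")`, which raises `docker.errors.NotFound` as in the traceback above, this returns `None` when nothing matches, so the caller decides how to handle a missing agent container.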

But the more I dig into this, the more I feel I had the wrong expectations about how the agent works. The way the agent application is written, it seems to be meant to run as a single agent per host.
That means I would need pod > "host" container > agent image instead of pod > agent image.

The agent image appears to manage the host through the Docker runtime API: it creates additional containers on the parent and mounts different host directories into each container instance.

My initial assumption was that the agent image on Docker Hub was a standalone agent container.

erik-hallros · 2024-10-26T23:07:46Z

@jimmy-ungerman I've made a container version that works, but I doubt it follows best practices or is remotely stable. I haven't done anything on networking yet, so there is still work left to make it functional. Feel free to check it out or contribute:
https://github.com/erik-hallros/kasm-agent

mmcclaskey · 2024-10-28T12:08:54Z

K8s is currently only supported for the infrastructure of Kasm, not for the end-user desktop containers. For those we do support KubeVirt auto-scaling, which uses full-stack VMs for agents. Some people are experimenting with DinD-type setups for running agents in K8s. Ultimately we really need to create a K8s-native agent that can create the end-user pods directly, rather than relying on DinD, but we are not there yet.

jimmy-ungerman · 2024-10-28T16:22:47Z

@mmcclaskey Makes sense. I saw the blog post on Friday that explicitly called out that agents aren't currently supported. Ideally, we would get a Kubernetes agent that just spins up a new pod running the desktop container a user selects.
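To make the idea concrete, here is a rough sketch of the kind of manifest such a Kubernetes-native agent might generate per session. Nothing here exists in Kasm today: the function `desktop_pod_manifest`, the `kasm-session-` name prefix, the labels, and the image passed in are all hypothetical.

```python
def desktop_pod_manifest(session_id: str, image: str, namespace: str = "default") -> dict:
    """Build a minimal Pod manifest for one end-user desktop session.

    `session_id` must consist of lowercase alphanumerics and '-' so that the
    resulting pod name stays a valid Kubernetes name.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"kasm-session-{session_id}",
            "namespace": namespace,
            "labels": {"app": "kasm-session", "session": session_id},
        },
        "spec": {
            # A desktop session is not restarted once it exits.
            "restartPolicy": "Never",
            "containers": [{"name": "desktop", "image": image}],
        },
    }
```

A dict like this could, for instance, be submitted with the official Kubernetes Python client's `CoreV1Api.create_namespaced_pod(namespace, body)`. A real agent would also have to handle networking, persistent profiles, and cleanup, which this sketch leaves out.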

erik-hallros changed the title ~~Attempting to run agent as pod returns No such file or directory in python executables~~ Attempting to run agent as pod fails Oct 22, 2024
