"wait_for_lease = true" does not take effect #1091

Open
SJFCS opened this issue Aug 14, 2024 · 8 comments

SJFCS commented Aug 14, 2024

System Information

Linux distribution

Archlinux

Terraform version

terraform -v
Terraform v1.9.4
on linux_amd64

Provider and libvirt versions

terraform-provider-libvirt -version
0.7.6

Description of Issue/Question

I use the following simple configuration, which installs qemu-ga through cloud-init.
With terraform-provider-libvirt 0.7.6, setting "qemu_agent = true" and "wait_for_lease = true" does not wait for qemu-ga to report the IP address; the apply fails with "Error: couldn't retrieve IP address". Only after changing the provider version to 0.7.1, running terraform init -upgrade, and applying again does it wait for qemu-ga to report the IP address.

Please excuse my poor English.

Setup

This is my main.tf:

terraform {
  required_version = ">= 0.13"
  required_providers {
    libvirt = {
      source  = "dmacvicar/libvirt"
      version = "0.7.6"
    }
  }
}

provider "libvirt" {
  uri = "qemu:///system"
}

data "template_file" "user_data" {
  template = file("${path.module}/cloud_init/cloud_init.yml")
}

data "template_file" "network_config" {
  template = file("${path.module}/cloud_init/network_config.yml")
}

resource "libvirt_cloudinit_disk" "cloudinit" {
  name           = "cloudinit.iso"
  user_data      = data.template_file.user_data.rendered
  network_config = data.template_file.network_config.rendered
  pool           = "default"
}

resource "libvirt_volume" "debian9-qcow2" {
  name   = "debian9-qcow2"
  pool   = "default"
  source = "./ubuntu-24.04-server-cloudimg-amd64.img"
}

// set boot order hd, network
resource "libvirt_domain" "domain-debian9-qcow2" {
  name       = "debian9"
  memory     = "1024"
  vcpu       = 1
  qemu_agent = true
  cloudinit  = libvirt_cloudinit_disk.cloudinit.id

  network_interface {
    bridge         = "br0"
    wait_for_lease = true
  }

  boot_device {
    dev = ["hd", "network"]
  }

  disk {
    volume_id = libvirt_volume.debian9-qcow2.id
  }

  graphics {
    type        = "spice"
    listen_type = "address"
    autoport    = true
  }
  provisioner "remote-exec" {
    inline = [
      <<-EOF
        sudo apt-get update
        sudo apt-get install nginx -y
        EOF
    ]
  }

  connection {
    type        = "ssh"
    user        = "ubuntu"
    host        = self.network_interface[0].addresses[0]
    private_key = file("~/.ssh/id_ed25519")
    timeout     = "2m"
  }
}

This is cloud_init.yml:

#cloud-config

bootcmd:
  - echo "This is a boot command"
runcmd:
  - [sh, -xc, "echo $(date) ': hello world!'"]
  - sudo apt-get update 
  - sudo apt-get install qemu-guest-agent -y
  - sudo systemctl enable --now qemu-guest-agent.service
ssh_pwauth: true
disable_root: false
users:
  - name: root
    plain_text_passwd: 'password'
    lock_passwd: false
  - name: ubuntu
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: users, admin
    home: /home/ubuntu
    shell: /bin/bash
    lock_passwd: false
    ssh-authorized-keys:
      - ssh-ed25519 Axxx5 [email protected]

This is network_config.yml:

version: 2
ethernets:
  ens3:
    dhcp4: true

Steps to Reproduce Issue

Steps with 0.7.6 (does not work):

  1. terraform init
  2. TF_LOG=DEBUG terraform apply -auto-approve
    Debug output with 0.7.6:
          </graphics>
          <rng model="virtio">
              <backend model="random">/dev/urandom</backend>
          </rng>
      </devices>
  </domain>: timestamp="2024-08-14T23:28:10.086+0800"
2024-08-14T23:28:10.435+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.6: 2024/08/14 23:28:10 [INFO] Domain ID: 1e643687-5914-469d-b5c8-356c5dc65790: timestamp="2024-08-14T23:28:10.435+0800"
2024-08-14T23:28:10.435+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.6: 2024/08/14 23:28:10 [DEBUG] Waiting for state to become: [all-addresses-obtained]: timestamp="2024-08-14T23:28:10.435+0800"
2024-08-14T23:28:15.441+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.6: 2024/08/14 23:28:15 [DEBUG] waiting for network address for iface=52:54:00:16:93:28: timestamp="2024-08-14T23:28:15.440+0800"
2024-08-14T23:28:15.441+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.6: 2024/08/14 23:28:15 [DEBUG] qemu-agent used to query interface info: timestamp="2024-08-14T23:28:15.441+0800"
2024-08-14T23:28:15.443+0800 [ERROR] provider.terraform-provider-libvirt_v0.7.6: Response contains error diagnostic: diagnostic_severity=ERROR tf_proto_version=5.3 tf_provider_addr=provider @caller=github.com/hashicorp/[email protected]/tfprotov5/internal/diag/diagnostics.go:55 @module=sdk.proto tf_req_id=3443beee-8402-aa9f-8e77-364a3bd03a5e tf_resource_type=libvirt_domain tf_rpc=ApplyResourceChange diagnostic_detail=""
  diagnostic_summary=
  | couldn't retrieve IP address of domain id: 1e643687-5914-469d-b5c8-356c5dc65790. Please check following: 
  | 1) is the domain running properly? 
  | 2) has the network interface an IP address? 
  | 3) Networking issues on your libvirt setup? 
  |  4) is DHCP enabled on this Domain's network? 
  | 5) if you use bridge network, the domain should have the pkg qemu-agent installed 
  | IMPORTANT: This error is not a terraform libvirt-provider error, but an error caused by your KVM/libvirt infrastructure configuration/setup 
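
One way to see what the guest agent reports, independently of the provider, is to poll it from the libvirt host. The sketch below is only an illustration and not the provider's own logic: the function name `wait_for_agent_ip` and its retry defaults are made up here, while the domain name `debian9` and the `qemu:///system` URI come from the configuration above. It relies on `virsh domifaddr --source agent`, which asks qemu-guest-agent for the interface addresses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll qemu-guest-agent until it reports a non-loopback
# IPv4 address for the domain, then print that address.
# Usage: wait_for_agent_ip DOMAIN [TRIES] [DELAY_SECONDS]
wait_for_agent_ip() {
  local domain="$1" tries="${2:-30}" delay="${3:-10}"
  local i=1 out addr
  while [ "$i" -le "$tries" ]; do
    # --source agent queries the guest agent rather than the DHCP leases
    if out="$(virsh -c qemu:///system domifaddr "$domain" --source agent 2>/dev/null)"; then
      # Columns: Name, MAC address, Protocol, Address (with /prefix)
      addr="$(printf '%s\n' "$out" |
        awk '$3 == "ipv4" && $4 !~ /^127\./ { sub(/\/.*/, "", $4); print $4; exit }')"
      if [ -n "$addr" ]; then
        echo "$addr"
        return 0
      fi
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1   # the agent never reported an address
}
```

For example, `wait_for_agent_ip debian9 30 10` would poll up to 30 times at 10-second intervals, the same interval the 0.7.1 log reports between tries.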

Steps with 0.7.1 (works):

  1. Change only the provider version:
    libvirt = {
      source  = "dmacvicar/libvirt"
      version = "0.7.1"
    }
  2. terraform init -upgrade
  3. TF_LOG=DEBUG terraform apply -auto-approve

Debug output with 0.7.1:

2024-08-14T23:26:07.000+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [DEBUG] waiting for network address for iface=52:54:00:7E:A5:63: timestamp="2024-08-14T23:26:07.000+0800"
2024-08-14T23:26:07.000+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [DEBUG] qemu-agent used to query interface info: timestamp="2024-08-14T23:26:07.000+0800"
2024-08-14T23:26:07.001+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [DEBUG] Interfaces info obtained with libvirt API:
([]libvirt.DomainInterface) <nil>: timestamp="2024-08-14T23:26:07.001+0800"
2024-08-14T23:26:07.001+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [DEBUG] ifaces with addresses: []: timestamp="2024-08-14T23:26:07.001+0800"
2024-08-14T23:26:07.001+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [DEBUG] 52:54:00:7E:A5:63 doesn't have IP address(es) yet...: timestamp="2024-08-14T23:26:07.001+0800"
2024-08-14T23:26:07.001+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [DEBUG] IP address not found for iface=52:54:00:7E:A5:63: will try in a while: timestamp="2024-08-14T23:26:07.001+0800"
2024-08-14T23:26:07.001+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:07 [TRACE] Waiting 10s before next try: timestamp="2024-08-14T23:26:07.001+0800"
libvirt_domain.domain-ubuntu: Still creating... [40s elapsed]
2024-08-14T23:26:17.010+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:17 [DEBUG] waiting for network address for iface=52:54:00:7E:A5:63: timestamp="2024-08-14T23:26:17.010+0800"
2024-08-14T23:26:17.010+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:17 [DEBUG] qemu-agent used to query interface info: timestamp="2024-08-14T23:26:17.010+0800"
2024-08-14T23:26:17.013+0800 [INFO]  provider.terraform-provider-libvirt_v0.7.1: 2024/08/14 23:26:17 [DEBUG] Interfaces info obtained with libvirt API:
([]libvirt.DomainInterface) (len=2 cap=2) {



Additional information:

Do you have SELinux or AppArmor/Firewall enabled? Some special configuration? No.

Contributor

scabala commented Sep 2, 2024

Hello,
could you try specifying wait_for_lease with an image that already has qemu-guest-agent installed? I successfully got an IP address from the VM when doing so.

Author

SJFCS commented Sep 3, 2024

> Hello, could you try specifying wait_for_lease with an image that already has qemu-guest-agent installed? I successfully got an IP address from the VM when doing so.

Thank you for the suggestion.
I haven't tried an image with qemu-guest-agent preinstalled, because I want qemu-guest-agent to be installed automatically during the cloud-init phase. That worked in previous versions but does not work in the new version.

Contributor

scabala commented Sep 3, 2024

I'll try to take a look and see if I can find anything changed that might cause it between those two versions.

Contributor

scabala commented Sep 18, 2024

I couldn't find anything in particular that changed between those versions. Also, I don't have a bridged network in my setup and it's hard for me to create one, so I used a NAT-ed one and couldn't reproduce it.

@SJFCS could you check if you can reproduce it in different network types? NAT-ed and routed for example?

EDIT: forget what I wrote, I can reproduce it, I just used the wrong image before 🤦

I'll try to bisect and see where the problem lies.

Contributor

scabala commented Sep 20, 2024

Okay, after more debugging: I cannot reproduce it. Previously I had problems with cloud-init. I think it might be related to cloud-init itself rather than to the provider.

Either way, I see consistent behavior between 0.7.6 and 0.7.1: it either fails if qemu-guest-agent is not installed and started, or runs fine otherwise.
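
The timestamps in the 0.7.6 log above show the error about five seconds after the domain was created, so it may help to measure when the agent first becomes reachable in a guest that installs it through cloud-init. The following is a rough sketch under assumptions: `agent_is_up` and `time_until_agent` are hypothetical helper names, and the `qemu:///system` URI is taken from the reporter's configuration. It uses `virsh qemu-agent-command` to send the guest agent's `guest-ping` command.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for timing the guest agent from the libvirt host.

# Succeeds if qemu-guest-agent in the given domain answers a guest-ping.
agent_is_up() {
  virsh -c qemu:///system qemu-agent-command "$1" '{"execute":"guest-ping"}' >/dev/null 2>&1
}

# Prints the seconds elapsed until the agent first answers.
# Usage: time_until_agent DOMAIN [LIMIT_SECONDS] [POLL_SECONDS]
time_until_agent() {
  local domain="$1" limit="${2:-300}" poll="${3:-1}"
  local start now
  start="$(date +%s)"
  until agent_is_up "$domain"; do
    now="$(date +%s)"
    if [ $((now - start)) -ge "$limit" ]; then
      return 1   # gave up: the agent never answered within the limit
    fi
    sleep "$poll"
  done
  now="$(date +%s)"
  echo $((now - start))
}
```

Starting `time_until_agent debian9` right after the domain boots gives a number that can be compared with how long each provider version waits before reporting the error.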

Author

SJFCS commented Sep 21, 2024

> I couldn't find anything in particular that changed between those versions. Also, I don't have a bridged network in my setup and it's hard for me to create one, so I used a NAT-ed one and couldn't reproduce it.
>
> @SJFCS could you check if you can reproduce it in different network types? NAT-ed and routed for example?
>
> EDIT: forget what I wrote, I can reproduce it, I just used the wrong image before 🤦
>
> I'll try to bisect and see where the problem lies.

The network configuration is the same, so I don't think it has anything to do with this.

Author

SJFCS commented Sep 21, 2024

> Okay, after more debugging: I cannot reproduce it. Previously I had problems with cloud-init. I think it might be related to cloud-init itself rather than to the provider.
>
> Either way, I see consistent behavior between 0.7.6 and 0.7.1: it either fails if qemu-guest-agent is not installed and started, or runs fine otherwise.

Okay, thanks for the troubleshooting, but I only changed the provider version number and kept the configuration unchanged.

Contributor

scabala commented Sep 21, 2024

Do you have cloud-init logs for both scenarios?
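
A minimal sketch of how those logs could be gathered inside the guest, assuming the usual cloud-init log locations on Ubuntu cloud images (`/var/log/cloud-init.log` and `/var/log/cloud-init-output.log`). The function name `collect_cloud_init_logs` and the output directory layout are invented for illustration, and the files may need sudo to read.

```shell
#!/usr/bin/env bash
# Hypothetical helper, run inside the guest: copy the cloud-init logs into a
# directory labelled with the provider version under test.
# Usage: collect_cloud_init_logs LABEL [LOG_DIR] [OUT_ROOT]
collect_cloud_init_logs() {
  local label="$1" log_dir="${2:-/var/log}" out_root="${3:-.}"
  local out_dir="$out_root/cloud-init-$label" f
  mkdir -p "$out_dir" || return 1
  for f in cloud-init.log cloud-init-output.log; do
    if [ -r "$log_dir/$f" ]; then
      cp "$log_dir/$f" "$out_dir/"
    else
      echo "missing or unreadable: $log_dir/$f" >&2
    fi
  done
  # Record whether cloud-init finished, if the CLI is available
  if command -v cloud-init >/dev/null 2>&1; then
    cloud-init status --long >"$out_dir/status.txt" 2>&1 || true
  fi
  echo "$out_dir"
}
```

Running `collect_cloud_init_logs 0.7.6` in one scenario and `collect_cloud_init_logs 0.7.1` in the other, then copying both directories off the guests, would allow a side-by-side comparison with `diff -u`.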
