Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check the VM status using virtctl instead of the SSH protocol #916

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ebattat
Copy link
Collaborator

@ebattat ebattat commented Oct 10, 2024

Type of change

Note: Fill x in []

  • bug
  • enhancement
  • documentation
  • dependencies

Description

Change the VM login method from SSH protocol to Virtctl, as this will provide clearer error messages.
Additionally, Since the VM could be in the middle of a migration, add a retry mechanism that makes 5 attempts with a 10-second pause between each retry.

For security reasons, all pull requests need to be approved first before running any automated CI

@ebattat ebattat added the bug Something isn't working label Oct 10, 2024
@ebattat ebattat self-assigned this Oct 10, 2024
@ebattat ebattat force-pushed the add_vm_ssh_retry branch 2 times, most recently from 039c08b to 545ae77 Compare October 10, 2024 14:52
@ebattat ebattat changed the title Add a VM retry mechanism in case of an error Add a VM retry mechanism using the virtctl protocol to verify SSH. Oct 10, 2024
benchmark_runner/common/oc/oc.py Outdated Show resolved Hide resolved
benchmark_runner/common/oc/oc.py Outdated Show resolved Hide resolved
benchmark_runner/common/oc/oc.py Outdated Show resolved Hide resolved
benchmark_runner/workloads/bootstorm_vm.py Outdated Show resolved Hide resolved
ssh_status = self._oc.get_vm_ssh_status(vm_name, node_ip, vm_node_port)
vm_ssh = 1 if ssh_status == 'True' else 0
ssh_status = self._oc.get_vm_virtctl_status(vm_name)
vm_ssh = 1 if ssh_status == 'True' else 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this separately from ssh_status? You can always use int(vm_ssh) for the same purpose.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of error we return the error message:

virtualmachineinstance.kubevirt.io "windows-vm-65265214-0" not found
kex_exchange_identification: Connection closed by remote host

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, but why does vm_ssh need to be distinct from ssh_status?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see...it would be cleaner to return the boolean status and the message as two return values:

if ok:
    return (True, '')
else:
    return (False, 'This is why it failed')

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need 2 parameters:

  1. vm_ssh: True/False - if ssh passed successfully (Filtering in Grafana only error msg)
  2. ssh_status: error msg in case of error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly; don't overload them.

Copy link
Collaborator Author

@ebattat ebattat Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep it separately for Grafana dashboard
vm_ssh: int
ssh_status: str

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you convert it only when you need to insert it into grafana, rather than carrying around two separate variables that could get out of sync?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RobertKrawitz I agree.
Use the 'virtctl_status' string variable (instead of 'ssh_status') with the value 'True' if successful, or an error message if it fails.
Additionally, add the 'virtctl_vm' integer value (instead of 'vm_ssh') only for Grafana.

@ebattat ebattat force-pushed the add_vm_ssh_retry branch 3 times, most recently from 599c028 to 1be1d2f Compare October 13, 2024 08:01
@ebattat ebattat changed the title Add a VM retry mechanism using the virtctl protocol to verify SSH. Check VM status by Virtctl instead of SSH protocol Oct 13, 2024
@ebattat ebattat changed the title Check VM status by Virtctl instead of SSH protocol Check the VM status using virtctl instead of the SSH protocol Oct 13, 2024
@ebattat ebattat added the ok-to-test PR ok to test label Oct 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working ok-to-test PR ok to test
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

2 participants