Merge pull request xapi-project#5265 from xcp-ng/gtn-vm-migration-walkthrough

Add VM migration walkthrough
robhoes authored Dec 4, 2023
2 parents c40d4c1 + 27cf212 commit 6ecc885
Showing 1 changed file with 178 additions and 0 deletions.
178 changes: 178 additions & 0 deletions doc/content/xenopsd/walkthroughs/VM.migrate.md
@@ -0,0 +1,178 @@
---
title: 'Walkthrough: Migrating a VM'
---

A XenAPI client wishes to migrate a VM from one host to another within
the same pool.

The client will issue a command to migrate the VM, and it will be dispatched
by the autogenerated `dispatch_call` function from **xapi/server.ml**. For
more information about the generated functions, have a look at the
[XAPI IDL model](https://github.com/xapi-project/xen-api/tree/master/ocaml/idl/ocaml_backend).

The command will trigger the operation
[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572),
which is composed of low-level operations performed by the backend. The atomic
operations that we will describe in this document are:

- VM.restore
- VM.rename
- VBD.set_active
- VBD.plug
- VIF.set_active
- VGPU.set_active
- VM.create_device_model
- PCI.plug
- VM.set_domain_action_request

The command has several parameters, such as: should it be run asynchronously,
should it be forwarded to another host, how arguments should be marshalled, and
so on. A new thread is created by [xapi/server_helpers.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/server_helpers.ml#L55)
to handle the command asynchronously. At this point the helper also checks whether
the command should be passed to the [message forwarding](https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/message_forwarding.ml)
layer in order to be executed on another host (the destination), or run locally if
we are already in the right place.

It will finally reach [xapi/api_server.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/api_server.ml#L242), which
posts a command to the message broker, [message switch](https://github.com/xapi-project/xen-api/tree/master/ocaml/message-switch).
This is a JSON-RPC HTTP request sent over a Unix socket, used for communication between some
XAPI daemons. In the case of a migration, the message sent by **XAPI** will be
consumed by the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
daemon, which will do the job of migrating the VM.
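
To make the shape of such a message concrete, here is a minimal sketch of a JSON-RPC request body. It assumes the JSON-RPC 2.0 envelope (`jsonrpc`, `method`, `params`, `id`); the method name and parameter names are hypothetical and are not taken from the xenopsd interface.

```python
import json


def make_request(method, params, request_id):
    """Build a JSON-RPC 2.0 request body as a string."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


# Hypothetical method and parameter names, for illustration only.
body = make_request(
    "VM.migrate", {"vm": "<vm-uuid>", "dest": "<destination-url>"}, 1
)

# The consumer decodes the body and dispatches on the "method" field.
decoded = json.loads(body)
```

The `id` field is what lets the sender match a later response to this request.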

# The migration of the VM

The migration is an asynchronous task and a thread is created to handle this task.
The task's reference is returned to the client, which can then check
its status until completion.
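
The pattern described here (start the work in a thread, hand back a reference, let the client poll) can be sketched as follows. This is an illustration only: the `Task` class, the status strings and the polling helper are assumptions, not the XenAPI task API.

```python
import threading
import time


class Task:
    """Hypothetical task record: a reference plus a status that can be polled."""

    def __init__(self, ref):
        self.ref = ref
        self._status = "pending"
        self._lock = threading.Lock()

    def set_status(self, status):
        with self._lock:
            self._status = status

    def get_status(self):
        with self._lock:
            return self._status


def start_task(ref, work):
    """Run `work` in a background thread and return the task immediately."""
    task = Task(ref)

    def run():
        try:
            work()
            task.set_status("success")
        except Exception:
            task.set_status("failure")

    threading.Thread(target=run, daemon=True).start()
    return task


def wait_for(task, poll_interval=0.01, timeout=5.0):
    """Poll the task until it leaves the pending state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = task.get_status()
        if status != "pending":
            return status
        time.sleep(poll_interval)
    return task.get_status()


# A short sleep stands in for the migration work.
task = start_task("task-1", lambda: time.sleep(0.05))
final_status = wait_for(task)
```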

As we saw in the introduction, the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
daemon will pop the operation
[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572)
from the message broker.

Only one backend is currently available. It interacts with libxc, libxenguest
and xenstore: the [xc backend](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd/xc).

The entities that need to be migrated are: *VDI*, *VIF*, *VGPU* and *PCI* components.

During the migration process the destination domain will be built with the same
UUID as the original VM, except that the last part of the UUID will be
`XXXXXXXX-XXXX-XXXX-XXXX-000000000001`. The original domain will be removed using
`XXXXXXXX-XXXX-XXXX-XXXX-000000000000`.
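
As an illustration of the naming scheme, the sketch below derives both temporary UUIDs from an original one by replacing its last group. The helper name and the example UUID are made up for this example; this is not the code used by xenopsd.

```python
def with_suffix(vm_uuid, suffix):
    """Replace the last group of a UUID string with a fixed 12-character suffix."""
    head, _, last = vm_uuid.rpartition("-")
    if len(last) != 12 or len(suffix) != 12:
        raise ValueError("expected a 12-character final UUID group")
    return f"{head}-{suffix}"


# Made-up UUID, for illustration only.
original = "c0ffee00-1234-5678-9abc-def012345678"

# Temporary identity of the domain being built on the destination.
destination_temp = with_suffix(original, "000000000001")

# Identity under which the original domain is removed.
source_temp = with_suffix(original, "000000000000")
```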

There are some points called *hooks* at which `xenopsd` can execute a script.
Before starting a migration, a command is sent to the original domain to execute
a pre-migrate script if it exists.
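
The "run it only if it exists" behaviour of a hook can be sketched like this. The script path and the argument passed to it are hypothetical; the real hook locations and calling convention are defined by `xenopsd` and are not reproduced here.

```python
import os
import subprocess


def run_hook(script_path, args):
    """Run a hook script if it exists and is executable.

    Returns the script's exit code, or None when there is no script to run.
    """
    if not (os.path.isfile(script_path) and os.access(script_path, os.X_OK)):
        return None
    return subprocess.run([script_path, *args], check=False).returncode


# Hypothetical path: nothing is installed there, so the hook is skipped.
result = run_hook("/nonexistent/hooks/vm-pre-migrate", ["<vm-uuid>"])
```

A missing script is treated as "nothing to do" rather than as an error, which is what makes the hook optional.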

Before starting the migration, a command is sent to QEMU using the QEMU Machine Protocol (QMP)
to check that the domain can be suspended (see [xenopsd/xc/device_common.ml](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/device_common.ml)).
After checking with QEMU that the VM is suspendable, we can start the migration.

## Importing metadata

As for *hooks*, commands to the source domain are sent using [stunnel](https://github.com/xapi-project/xen-api/tree/master/ocaml/libs/stunnel), a daemon which
is used as a wrapper to manage SSL-encrypted communication between two hosts in the same
pool. To import the metadata, an XML-RPC command is sent to the original domain.

Once imported, the metadata gives us a reference id, which allows us to build the new domain
on the destination using the temporary VM UUID `XXXXXXXX-XXXX-XXXX-XXXX-000000000001`
where `XXX...` is the reference id of the original VM.

## Setting memory

One of the first things to do is to set up the memory. The backend will check that there
is no ballooning operation in progress. At this point the migration can fail if a
ballooning operation is in progress and takes too much time.
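
A wait of this kind is typically a polling loop with a deadline. The sketch below shows the idea; the `is_ballooning` callback, the timeout value and the exception type are assumptions, not the actual backend logic.

```python
import time


def wait_until_idle(is_ballooning, timeout, poll_interval=0.01):
    """Block until is_ballooning() returns False.

    Raises TimeoutError if ballooning is still in progress at the deadline.
    """
    deadline = time.monotonic() + timeout
    while is_ballooning():
        if time.monotonic() >= deadline:
            raise TimeoutError("ballooning still in progress")
        time.sleep(poll_interval)


# Simulated domain: reports ballooning for the first three polls only.
remaining = [3]


def fake_is_ballooning():
    remaining[0] -= 1
    return remaining[0] >= 0


wait_until_idle(fake_is_ballooning, timeout=1.0)
```

If the callback never returns `False`, the function raises, which corresponds to the failure case described above.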

Once the memory has been checked, the daemon will get the state of the VM (running, halted, ...),
and the backend retrieves information about the VM from xenstore, such as the maximum
memory the domain can consume and its quotas.

Once this is complete, we can restore the VIFs and create the domain.

The synchronisation of the memory is the first point of synchronisation, and everything
is then ready for VM migration.

## VM Migration

After receiving memory we can set up the destination domain. If we have a vGPU we need to kick
off its migration process. We then need to wait for the acknowledgement that indicates that the entry
for the GPU has been initialized before starting the main VM migration.

There is a handshake mechanism for synchronizing the source and the
destination. Using the handshake protocol, the receiver informs the sender of the
request that everything has been set up and is ready for save/restore.
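
The synchronization point can be pictured as one side blocking until the other side sends a short "ready" message. The sketch below uses a local socket pair to show this; the message format is hypothetical and is not the wire format used by xenopsd.

```python
import socket

# Hypothetical wire format: a single newline-terminated token.
SUCCESS = b"success\n"


def send_ready(sock):
    """Receiver side: announce that everything is set up."""
    sock.sendall(SUCCESS)


def recv_ready(sock):
    """Sender side: block until the peer announces that it is ready."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = sock.recv(64)
        if not chunk:
            raise ConnectionError("peer closed before the handshake completed")
        data += chunk
    if data != SUCCESS:
        raise RuntimeError(f"unexpected handshake message: {data!r}")
    return True


# A connected socket pair stands in for the link between the two hosts.
sender, receiver = socket.socketpair()
send_ready(receiver)
handshake_ok = recv_ready(sender)
sender.close()
receiver.close()
```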

### VM restore

VM restore is a low-level atomic operation, [VM.restore](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2684).
This operation is represented by a function call to the [backend](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/domain.ml#L1540).
It uses **Xenguest**, a low-level utility from the XAPI toolstack, to interact with the Xen hypervisor
and libxc in order to send a migration request to **emu-manager**.

After sending the request, results coming from **emu-manager** are collected
by the main thread, which blocks until the results are received.

During the live migration, **emu-manager** helps in ensuring the correct state
transitions for the devices and handling the message passing for the VM as
it's moved between hosts. This includes making sure that the state of the
VM's virtual devices, like disks or network interfaces, is correctly moved over.

### VM renaming

Once all operations are done, we can rename the VM on the target from its temporary
name to its real UUID. This operation is another low-level atomic one,
[VM.rename](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L1667),
which takes care of updating xenstore on the destination.

The next step is to restore the devices and unpause the domain.

### Restoring remaining devices

Restoring devices starts by activating the VBDs using the low-level atomic operation
[VBD.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3674). It is an update of xenstore. VBDs that are read-write must
be plugged before read-only ones. Once activated, the low-level atomic operation
[VBD.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3721)
is called. VDIs are attached and activated.
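
The ordering constraint on VBDs (read-write before read-only) amounts to a stable sort on the access mode. Here is a minimal sketch; the `Vbd` record and its field names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Vbd:
    """Hypothetical VBD record: an identifier and its access mode."""

    id: str
    read_write: bool


def plug_order(vbds):
    """Return the VBDs with read-write ones first, then read-only ones.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(vbds, key=lambda vbd: not vbd.read_write)


vbds = [
    Vbd("cdrom", False),
    Vbd("root", True),
    Vbd("iso", False),
    Vbd("data", True),
]
ordered_ids = [vbd.id for vbd in plug_order(vbds)]
```

With this input, `root` and `data` come first, followed by `cdrom` and `iso`.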

The next devices are the VIFs, which are set as active with [VIF.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4296) and plugged with [VIF.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4394).
If there are VGPUs, we set them as active now using the atomic [VGPU.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3490).

We are almost done. The next step is to create the device model.

#### Create device model

The device model is created using the atomic operation [VM.create_device_model](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2375). This
configures and starts **qemu-dm**, which allows PCI devices to be managed.

#### PCI plug

[PCI.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3399)
is executed by the backend. It plugs a PCI device and advertises it to QEMU if this option is set. This is
the case for NVIDIA SR-IOV vGPUs.

At this point devices have been restored. The new domain is considered survivable. We can
unpause the domain and perform the last actions.

### Unpause and done

Unpausing is done by managing the state of the domain using bindings to [xenctrl](https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/ctrl/xc_domain.c;h=f2d9d14b4d9f24553fa766c5dcb289f88d684bb0;hb=HEAD#l76).
Once the hypervisor has unpaused the domain, some actions can be requested using [VM.set_domain_action_request](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3172).
It is a path in xenstore. By default no action is taken, but a reboot can, for example, be
initiated.

Previously we mentioned some points called *hooks* at which `xenopsd` can execute a script. There
is also a hook to run a post-migrate script. After the execution of the script, if there is one,
the migration is almost done. The last step is a handshake to seal the success of the migration,
after which the old VM can be cleaned up.

# Links

Some links are old, but even though many changes have occurred, they remain relevant for a global understanding
of the XAPI toolstack.

- [XAPI architecture](https://xapi-project.github.io/xapi/architecture.html)
- [XAPI dispatcher](https://wiki.xenproject.org/wiki/XAPI_Dispatch)
- [Xenopsd architecture](https://xapi-project.github.io/xenopsd/architecture.html)
