Merge pull request xapi-project#5265 from xcp-ng/gtn-vm-migration-wal…
…kthrough Add VM migration walkthrough
Showing 1 changed file with 178 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,178 @@ | ||
--- | ||
title: 'Walkthrough: Migrating a VM' | ||
--- | ||
|
||
A XenAPI client wishes to migrate a VM from one host to another within | ||
the same pool. | ||
|
||
The client will issue a command to migrate the VM and it will be dispatched | ||
by the autogenerated `dispatch_call` function from **xapi/server.ml**. For | ||
more information about the generated functions, have a look at the | ||
[XAPI IDL model](https://github.com/xapi-project/xen-api/tree/master/ocaml/idl/ocaml_backend). | ||
|
||
The command will trigger the operation | ||
[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572) | ||
which consists of low-level operations performed by the backend. The atomic operations | ||
described in this walkthrough are: | ||
|
||
- VM.restore | ||
- VM.rename | ||
- VBD.set_active | ||
- VBD.plug | ||
- VIF.set_active | ||
- VGPU.set_active | ||
- VM.create_device_model | ||
- PCI.plug | ||
- VM.set_domain_action_request | ||
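
To make the idea of an operation built from atomics concrete, here is a minimal sketch in Python. It is not xenopsd's code: the names are copied from the list above, the order simply follows this walkthrough, and `run_atomics` and the no-op handlers are hypothetical stand-ins.

```python
from typing import Callable, Dict, List

# Names copied from the list above; the order mirrors the walkthrough,
# not necessarily the order used inside xenopsd.
ATOMICS: List[str] = [
    "VM.restore", "VM.rename", "VBD.set_active", "VBD.plug", "VIF.set_active",
    "VGPU.set_active", "VM.create_device_model", "PCI.plug",
    "VM.set_domain_action_request",
]


def run_atomics(names: List[str], handlers: Dict[str, Callable[[], None]]) -> List[str]:
    """Run each atomic in order; an exception in a handler aborts the sequence."""
    done = []
    for name in names:
        handlers[name]()  # hypothetical handler for this atomic
        done.append(name)
    return done


# No-op stand-ins so that the sketch runs on its own.
handlers = {name: (lambda: None) for name in ATOMICS}
print(run_atomics(ATOMICS, handlers))
```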
|
||
The command has several parameters, such as: should it be run asynchronously, | ||
should it be forwarded to another host, how arguments should be marshalled, and | ||
so on. A new thread is created by [xapi/server_helpers.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/server_helpers.ml#L55) | ||
to handle the command asynchronously. At this point the helper also checks if | ||
the command should be passed to the [message forwarding](https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/message_forwarding.ml) | ||
layer in order to be executed on another host (the destination) or locally if | ||
we are already at the right place. | ||
|
||
It will finally reach [xapi/api_server.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/api_server.ml#L242), which posts | ||
a command to the message broker, [message switch](https://github.com/xapi-project/xen-api/tree/master/ocaml/message-switch). | ||
It is a JSON-RPC HTTP request sent over a Unix socket to communicate between some | ||
XAPI daemons. In the case of the migration, this message sent by **XAPI** will be | ||
consumed by the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd) | ||
daemon that will do the job of migrating the VM. | ||
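
As a rough illustration of what a JSON-RPC request body looks like, the sketch below serialises a request object with the usual `method`, `params` and `id` fields. The method name `VM.migrate`, the parameter keys and the placeholder values are assumptions made for the example, not the actual message exchanged between XAPI and xenopsd.

```python
import json


def make_request(method: str, params: list, request_id: int) -> str:
    """Serialise a JSON-RPC style request object (method, params, id)."""
    return json.dumps({"method": method, "params": params, "id": request_id})


# Hypothetical method name and parameters, for illustration only.
body = make_request("VM.migrate", [{"vm": "<vm-uuid>", "url": "<destination-url>"}], 1)
print(body)
```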
|
||
# The migration of the VM | ||
|
||
The migration is an asynchronous task and a thread is created to handle this task. | ||
The task's reference is returned to the client, which can then check | ||
its status until completion. | ||
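
The client side of this pattern is a polling loop. The sketch below assumes a hypothetical `get_status` callable that returns `"pending"` while the task is running and some other string once it has finished. A real client would query the task through the XenAPI instead of the fake status source used here.

```python
import itertools
import time
from typing import Callable


def wait_for_task(get_status: Callable[[], str],
                  interval: float = 0.01,
                  timeout: float = 5.0) -> str:
    """Poll get_status until it stops returning "pending" or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status != "pending":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("task still pending")
        time.sleep(interval)


# Stand-in for a real task: pending twice, then finished.
_states = itertools.chain(["pending", "pending"], itertools.repeat("success"))
print(wait_for_task(lambda: next(_states)))
```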
|
||
As we saw in the introduction, the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd) | ||
daemon will pop the operation | ||
[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572) | ||
from the message broker. | ||
|
||
Only one backend is currently available; it interacts with libxc, libxenguest | ||
and xenstore. It is the [xc backend](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd/xc). | ||
|
||
The entities that need to be migrated are: *VDI*, *VIF*, *VGPU* and *PCI* components. | ||
|
||
During the migration process the destination domain will be built with the same | ||
UUID as the original VM, except that the last part of the UUID will be | ||
`XXXXXXXX-XXXX-XXXX-XXXX-000000000001`. The original domain will be removed using | ||
`XXXXXXXX-XXXX-XXXX-XXXX-000000000000`. | ||
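
The temporary UUID scheme can be sketched as a small string transformation. The two suffixes come from the text above. The function name `with_last_group` and the sample UUID are made up for the example.

```python
import uuid

# Suffix values taken from the text above; everything else is illustrative.
TEMP_SUFFIX_DESTINATION = "000000000001"
TEMP_SUFFIX_SOURCE = "000000000000"


def with_last_group(vm_uuid: str, suffix: str) -> str:
    """Return vm_uuid with its last 12-hex-digit group replaced by suffix."""
    canonical = str(uuid.UUID(vm_uuid))  # validates and normalises the input
    head, _, _ = canonical.rpartition("-")
    return f"{head}-{suffix}"


original = "6f1d2c3a-9b4e-4c7d-8a21-5e0f3b7c9d12"  # made-up VM UUID
print(with_last_group(original, TEMP_SUFFIX_DESTINATION))
print(with_last_group(original, TEMP_SUFFIX_SOURCE))
```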
|
||
There are some points called *hooks* at which `xenopsd` can execute a script. | ||
Before starting a migration, a command is sent to the original domain to execute | ||
a pre-migrate script if it exists. | ||
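
The "run a script if it exists" behaviour of a hook can be sketched as follows. This is a simplified model, not xenopsd's hook mechanism: the helper `run_hook`, the script name `pre-migrate.py` and the use of a Python script as the hook are all assumptions chosen to keep the example portable.

```python
import os
import subprocess
import sys
import tempfile
from typing import Optional


def run_hook(script: str, *args: str) -> Optional[int]:
    """Run the hook script if it exists and return its exit code, else None."""
    if not os.path.isfile(script):
        return None  # no hook installed: nothing to do
    # Assumption: the hook is a Python script, purely to keep the sketch portable.
    return subprocess.run([sys.executable, script, *args], check=False).returncode


with tempfile.TemporaryDirectory() as directory:
    hook = os.path.join(directory, "pre-migrate.py")  # hypothetical hook name
    print(run_hook(hook, "vm-uuid"))  # hook not installed yet
    with open(hook, "w") as handle:
        handle.write("import sys\nsys.exit(0)\n")
    print(run_hook(hook, "vm-uuid"))  # hook installed and executed
```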
|
||
Before starting the migration, a command is sent to QEMU using the QEMU Machine Protocol (QMP) | ||
to check that the domain can be suspended (see [xenopsd/xc/device_common.ml](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/device_common.ml)). | ||
After checking with QEMU that the VM is suspendable, we can start the migration. | ||
|
||
## Importing metadata | ||
|
||
As with *hooks*, commands to the source domain are sent using [stunnel](https://github.com/xapi-project/xen-api/tree/master/ocaml/libs/stunnel), a daemon which | ||
is used as a wrapper to manage SSL-encrypted communication between two hosts in the same | ||
pool. To import the metadata, an XML-RPC command is sent to the original domain. | ||
|
||
Once imported, the metadata gives us a reference id and allows us to build the new domain | ||
on the destination using the temporary VM UUID `XXXXXXXX-XXXX-XXXX-XXXX-000000000001` | ||
where `XXX...` is the reference id of the original VM. | ||
|
||
## Setting memory | ||
|
||
One of the first things to do is to set up the memory. The backend will check that there | ||
is no ballooning operation in progress. At this point the migration can fail if a | ||
ballooning operation is in progress and takes too much time. | ||
|
||
Once the memory has been checked, the daemon will get the state of the VM (running, halted, ...). | ||
The backend also retrieves information about the VM from xenstore, such as the maximum memory | ||
the domain can consume and information about quotas. | ||
|
||
Once this is complete, we can restore VIF and create the domain. | ||
|
||
Memory synchronisation is the first synchronisation point; once it is done, everything | ||
is ready for VM migration. | ||
|
||
## VM Migration | ||
|
||
After receiving memory we can set up the destination domain. If we have a vGPU we need to kick | ||
off its migration process. We will need to wait for the acknowledgement that indicates that the entry | ||
for the GPU has been correctly initialized before starting the main VM migration. | ||
|
||
There is a handshake mechanism for synchronizing the source and the | ||
destination. Using the handshake protocol, the receiver informs the sender of the | ||
request that everything has been set up and is ready for save/restore. | ||
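
The handshake idea can be sketched with two threads and a queue: the sender blocks until the receiver reports that it is ready. This is only a toy model of the synchronisation. The real protocol runs between two hosts, and the message `"ready"` and the function names are assumptions.

```python
import queue
import threading

to_sender: "queue.Queue[str]" = queue.Queue()
log = []


def receiver() -> None:
    log.append("receiver: destination domain set up")
    to_sender.put("ready")  # acknowledge to the sender


def sender() -> None:
    ack = to_sender.get(timeout=5)  # block until the receiver acknowledges
    if ack != "ready":
        raise RuntimeError(f"unexpected handshake message: {ack!r}")
    log.append("sender: starting save/restore")


threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(log)
```

Because the sender only proceeds after it has received the acknowledgement, the receiver's log entry always comes first, whichever thread starts first.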
|
||
### VM restore | ||
|
||
VM restore is a low level atomic operation [VM.restore](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2684). | ||
This operation is represented by a function call to [backend](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/domain.ml#L1540). | ||
It uses **Xenguest**, a low-level utility from the XAPI toolstack, to interact with the Xen hypervisor | ||
and libxc for sending a migration request to the **emu-manager**. | ||
|
||
After the request is sent, the results coming from **emu-manager** are collected | ||
by the main thread, which blocks until the results are received. | ||
|
||
During the live migration, **emu-manager** helps in ensuring the correct state | ||
transitions for the devices and handling the message passing for the VM as | ||
it's moved between hosts. This includes making sure that the state of the | ||
VM's virtual devices, like disks or network interfaces, is correctly moved over. | ||
|
||
### VM renaming | ||
|
||
Once all operations are done we can rename the VM on the target from its temporary | ||
name to its real UUID. This operation is another low level atomic one | ||
[VM.rename](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L1667) | ||
that will take care of updating the xenstore on the destination. | ||
|
||
The next step is to restore the devices and unpause the domain. | ||
|
||
### Restoring remaining devices | ||
|
||
Restoring devices starts by activating VBDs using the low level atomic operation | ||
[VBD.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3674). It is an update of Xenstore. VBDs that are read-write must | ||
be plugged before read-only ones. Once activated the low level atomic operation | ||
[VBD.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3721) | ||
is called. VDIs are attached and activated. | ||
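
The ordering rule for VBDs (read-write before read-only) can be expressed as a stable sort. The `Vbd` class and the device names below are hypothetical. Only the ordering rule comes from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Vbd:
    device: str
    read_only: bool


def plug_order(vbds: List[Vbd]) -> List[Vbd]:
    """Read-write VBDs first, then read-only; sorted() is stable, so ties keep input order."""
    return sorted(vbds, key=lambda vbd: vbd.read_only)


vbds = [Vbd("xvdd", True), Vbd("xvda", False), Vbd("xvdb", False)]
for vbd in plug_order(vbds):
    print(vbd.device, "ro" if vbd.read_only else "rw")
```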
|
||
The next devices are the VIFs, which are set as active with [VIF.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4296) and plugged with [VIF.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4394). | ||
If there are VGPUs we will set them as active now using the atomic [VGPU.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3490). | ||
|
||
We are almost done. The next step is to create the device model. | ||
|
||
#### Create device model | ||
|
||
Create device model is done by using the atomic operation [VM.create_device_model](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2375). This | ||
will configure **qemu-dm** and start it. This allows PCI devices to be managed. | ||
|
||
#### PCI plug | ||
|
||
[PCI.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3399) | ||
is executed by the backend. It plugs a PCI device and advertises it to QEMU if this option is set. This is | ||
the case for NVIDIA SR-IOV vGPUs. | ||
|
||
At this point devices have been restored. The new domain is considered survivable. We can | ||
unpause the domain and perform the last actions. | ||
|
||
### Unpause and done | ||
|
||
Unpause is done by managing the state of the domain using bindings to [xenctrl](https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/ctrl/xc_domain.c;h=f2d9d14b4d9f24553fa766c5dcb289f88d684bb0;hb=HEAD#l76). | ||
Once the hypervisor has unpaused the domain, some actions can be requested using [VM.set_domain_action_request](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3172). | ||
It is a path in xenstore. By default no action is taken, but a reboot, for example, can be | ||
initiated. | ||
|
||
Earlier we mentioned points called *hooks* at which `xenopsd` can execute a script. There | ||
is also a hook to run a post-migrate script. After the execution of the script, if there is one, | ||
the migration is almost done. The last step is a handshake to seal the success of the migration, | ||
after which the old VM can be cleaned up. | ||
|
||
# Links | ||
|
||
Some links are old, but even though many changes have occurred, they are still relevant for a global understanding | ||
of the XAPI toolstack. | ||
|
||
- [XAPI architecture](https://xapi-project.github.io/xapi/architecture.html) | ||
- [XAPI dispatcher](https://wiki.xenproject.org/wiki/XAPI_Dispatch) | ||
- [Xenopsd architecture](https://xapi-project.github.io/xenopsd/architecture.html) |