diff --git a/doc/content/xenopsd/walkthroughs/VM.migrate.md b/doc/content/xenopsd/walkthroughs/VM.migrate.md
new file mode 100644
index 00000000000..080ebdb8edc
--- /dev/null
+++ b/doc/content/xenopsd/walkthroughs/VM.migrate.md
@@ -0,0 +1,178 @@
+---
+title: 'Walkthrough: Migrating a VM'
+---
+
+A XenAPI client wishes to migrate a VM from one host to another within
+the same pool.
+
+The client issues a command to migrate the VM, which is dispatched
+by the autogenerated `dispatch_call` function from **xapi/server.ml**. For
+more information about the generated functions, have a look at the
+[XAPI IDL model](https://github.com/xapi-project/xen-api/tree/master/ocaml/idl/ocaml_backend).
+
+The command triggers the operation
+[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572),
+which is made up of low-level operations performed by the backend. The atomic
+operations described in this document are:
+
+- VM.restore
+- VM.rename
+- VBD.set_active
+- VBD.plug
+- VIF.set_active
+- VGPU.set_active
+- VM.create_device_model
+- PCI.plug
+- VM.set_domain_action_request
+
+The command has several parameters, such as: should it be run asynchronously,
+should it be forwarded to another host, how arguments should be marshalled, and
+so on. A new thread is created by [xapi/server_helpers.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/server_helpers.ml#L55)
+to handle the command asynchronously. At this point the helper also checks whether
+the command should be passed to the [message forwarding](https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/message_forwarding.ml)
+layer in order to be executed on another host (the destination), or locally if
+we are already at the right place.
+
+The command finally reaches [xapi/api_server.ml](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xapi/api_server.ml#L242), which posts
+a command to the message broker, [message switch](https://github.com/xapi-project/xen-api/tree/master/ocaml/message-switch).
+This is a JSON-RPC HTTP request sent over a Unix socket, used for communication between
+XAPI daemons. In the case of the migration, the message sent by **XAPI** will be
+consumed by the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
+daemon, which does the job of migrating the VM.
+
+# The migration of the VM
+
+The migration is an asynchronous task, and a thread is created to handle it.
+The task's reference is returned to the client, which can then check
+its status until completion.
+
+As described in the introduction, the [xenopsd](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd)
+daemon pops the operation
+[VM_migrate](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/lib/xenops_server.ml#L2572)
+from the message broker.
+
+Only one backend is currently available. It interacts with libxc, libxenguest
+and xenstore: the [xc backend](https://github.com/xapi-project/xen-api/tree/master/ocaml/xenopsd/xc).
+
+The entities that need to be migrated are the *VDI*, *VIF*, *VGPU* and *PCI* components.
+
+During the migration process the destination domain is built with the same
+UUID as the original VM, except that the last part of the UUID is
+`XXXXXXXX-XXXX-XXXX-XXXX-000000000001`. The original domain will be removed using
+`XXXXXXXX-XXXX-XXXX-XXXX-000000000000`.
+
+There are some points, called *hooks*, at which `xenopsd` can execute a script.
+Before starting a migration, a command is sent to the original domain to execute
+a pre-migrate script if one exists.
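The UUID naming scheme described above can be illustrated with a minimal sketch. It only shows how the last group of a UUID is swapped for the two suffixes mentioned in the text. The helper name `with_last_group`, the constant names and the sample UUID are hypothetical, and this is not how xenopsd itself is implemented.

```python
import uuid

# Suffixes taken from the walkthrough text; all names here are hypothetical.
TEMPORARY_SUFFIX = "000000000001"  # destination domain while migrating
OLD_SUFFIX = "000000000000"        # original domain scheduled for removal


def with_last_group(vm_uuid: str, suffix: str) -> str:
    """Return vm_uuid with its last 12-hex-digit group replaced by suffix."""
    canonical = str(uuid.UUID(vm_uuid))  # validates and normalises the input
    head, _, _ = canonical.rpartition("-")
    return f"{head}-{suffix}"


vm = "7f3c2a10-9b4e-4d6a-8c1f-5e2d7a9b0c3d"  # made-up example UUID
print(with_last_group(vm, TEMPORARY_SUFFIX))
print(with_last_group(vm, OLD_SUFFIX))
```

Keeping the first four groups unchanged means both temporary names can still be traced back to the original VM.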
+
+Before starting the migration, a command is sent to QEMU using the QEMU Machine Protocol (QMP)
+to check that the domain can be suspended (see [xenopsd/xc/device_common.ml](https://github.com/xapi-project/xen-api/blob/master/ocaml/xenopsd/xc/device_common.ml)).
+Once QEMU has confirmed that the VM is suspendable, the migration can start.
+
+## Importing metadata
+
+As for *hooks*, commands to the source domain are sent using [stunnel](https://github.com/xapi-project/xen-api/tree/master/ocaml/libs/stunnel), a daemon which
+is used as a wrapper to manage SSL-encrypted communication between two hosts in the same
+pool. To import the metadata, an XML-RPC command is sent to the original domain.
+
+Once imported, the metadata gives us a reference id and allows the new domain to be built
+on the destination using the temporary VM UUID `XXXXXXXX-XXXX-XXXX-XXXX-000000000001`,
+where `XXX...` is the reference id of the original VM.
+
+## Setting memory
+
+One of the first things to do is to set up the memory. The backend checks that there
+is no ballooning operation in progress. At this point the migration can fail if a
+ballooning operation is in progress and takes too much time.
+
+Once the memory has been checked, the daemon gets the state of the VM (running, halted, ...),
+and the backend retrieves information about the VM, such as the maximum memory the domain
+can consume and, for example, information about quotas.
+The backend retrieves this information from xenstore.
+
+Once this is complete, we can restore the VIFs and create the domain.
+
+The synchronisation of the memory is the first synchronisation point. After it, everything
+is ready for the VM migration.
+
+## VM Migration
+
+After receiving the memory we can set up the destination domain. If we have a vGPU we need to kick
+off its migration process. We then need to wait for the acknowledgement indicating that the entry
+for the GPU has been properly initialized before starting the main VM migration.
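The "wait for the vGPU acknowledgement before the main migration" ordering can be sketched with a plain event. This is only an illustration of the ordering constraint: `vgpu_receiver`, `migrate_vm`, the `log` list and the timeout are assumptions made for the example and do not come from xenopsd.

```python
import threading

# Hypothetical stand-ins: none of these names come from xenopsd.
vgpu_ready = threading.Event()
log = []


def vgpu_receiver() -> None:
    """Pretend to initialise the vGPU entry, then acknowledge."""
    log.append("vgpu: entry initialised")
    vgpu_ready.set()  # this plays the role of the acknowledgement


def migrate_vm(has_vgpu: bool, ack_timeout_s: float = 5.0) -> None:
    """Kick off the vGPU migration first and wait for its acknowledgement."""
    if has_vgpu:
        worker = threading.Thread(target=vgpu_receiver)
        worker.start()
        # The timeout is an assumption of this sketch, not a documented value.
        if not vgpu_ready.wait(timeout=ack_timeout_s):
            raise TimeoutError("no vGPU acknowledgement received")
        worker.join()
    log.append("vm: main migration started")


migrate_vm(has_vgpu=True)
print(log)
```

The main migration step is only recorded after the event has been set, which mirrors the requirement that the GPU entry is initialised first.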
+
+There is a handshake mechanism for synchronizing the source and the
+destination. Using the handshake protocol, the receiver informs the sender of the
+request that everything has been set up and is ready to save/restore.
+
+### VM restore
+
+VM restore is a low-level atomic operation, [VM.restore](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2684).
+This operation is implemented by a function call into the [backend](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/domain.ml#L1540).
+It uses **Xenguest**, a low-level utility from the XAPI toolstack, to interact with the Xen hypervisor
+and libxc in order to send a migration request to **emu-manager**.
+
+After sending the request, the main thread collects the results coming from
+**emu-manager**. It blocks until the results are received.
+
+During the live migration, **emu-manager** helps ensure the correct state
+transitions for the devices and handles the message passing for the VM as
+it is moved between hosts. This includes making sure that the state of the
+VM's virtual devices, like disks or network interfaces, is correctly moved over.
+
+### VM renaming
+
+Once all operations are done we can rename the VM on the target from its temporary
+name to its real UUID. This is another low-level atomic operation,
+[VM.rename](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L1667),
+which takes care of updating xenstore on the destination.
+
+The next step is to restore the devices and unpause the domain.
+
+### Restoring remaining devices
+
+Restoring devices starts by activating the VBDs using the low-level atomic operation
+[VBD.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3674).
+Setting a VBD active is an update of xenstore. VBDs that are read-write must
+be plugged before read-only ones. Once the VBDs are active, the low-level atomic operation
+[VBD.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3721)
+is called. The VDIs are then attached and activated.
+
+The next devices are the VIFs, which are set as active with [VIF.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4296) and plugged with [VIF.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L4394).
+If there are VGPUs, they are set as active at this point using the atomic operation [VGPU.set_active](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3490).
+
+We are almost done. The next step is to create the device model.
+
+#### create device model
+
+The device model is created using the atomic operation [VM.create_device_model](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L2375). This
+configures and starts **qemu-dm**, which allows PCI devices to be managed.
+
+#### PCI plug
+
+[PCI.plug](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3399)
+is executed by the backend. It plugs a PCI device and advertises it to QEMU if this option is set. This is
+the case for NVIDIA SR-IOV vGPUs.
+
+At this point the devices have been restored and the new domain is considered survivable. We can
+unpause the domain and perform the last actions.
+
+### Unpause and done
+
+Unpausing is done by managing the state of the domain using bindings to [xenctrl](https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/ctrl/xc_domain.c;h=f2d9d14b4d9f24553fa766c5dcb289f88d684bb0;hb=HEAD#l76).
+Once the hypervisor has unpaused the domain, some actions can be requested using [VM.set_domain_action_request](https://github.com/xapi-project/xen-api/blob/7ac88b90e762065c5ebb94a8ea61c61bdbf62c5c/ocaml/xenopsd/xc/xenops_server_xen.ml#L3172).
+The request is a path in xenstore. By default no action is taken, but a reboot, for example,
+can be initiated.
+
+Earlier we mentioned points called *hooks* at which `xenopsd` can execute a script. There
+is also a hook to run a post-migrate script. After the execution of this script, if there is one,
+the migration is almost done. The last step is a handshake to seal the success of the migration,
+after which the old VM can be cleaned up.
+
+# Links
+
+Some of these links are old, but even though many changes have occurred they remain relevant
+for a global understanding of the XAPI toolstack.
+
+- [XAPI architecture](https://xapi-project.github.io/xapi/architecture.html)
+- [XAPI dispatcher](https://wiki.xenproject.org/wiki/XAPI_Dispatch)
+- [Xenopsd architecture](https://xapi-project.github.io/xenopsd/architecture.html)