Wednesday, October 14, 2015

On Instance Migration in HOS 1.1.1

As of HOS 1.1.1, the instance migration feature has several restrictions.
Recently I have been looking into the following one:

  - Under HOS 1.1.1, we cannot live migrate (*1) BFV (boot-from-volume) VM instances, regardless of whether they have additional cinder volumes attached,
    although this is possible in upstream stable/juno.

(*1) Here, I don’t mean block migration (‘nova live-migration --block-migrate’),
        which has been disabled upstream since Juno for VM instances with cinder volumes.

(*2) My customer uses neither a shared file system nor ceph for the VM instance store. Still, the natural expectation is
       that live migration is possible.
  
Today, I made a breakthrough on live migration of BFV instances.

In short, several bug fixes for live migration of instances with cinder volumes (including BFV) were made after the Juno release,
and HOS 1.1.1 does not include them.

Thus, we need to patch Nova as follows to make live migration work.

Please note that I’m talking about the *capability* of HOS 1.1.1 here.
I am aware of the Vancouver Summit Live Migration session from the HP Public Cloud team;
whether we should offer the live migration feature for production use is a separate issue.


The note below is a summary of what I did.

------

Necessary modifications to enable Live Migration of BFV Instances

We need to backport (at least) the following 2 patches from the Nova stable/juno branch.

(1) patch1
commit 42cae28241cd0c213201d036bfbe13fb118e4bee
Author: Cyril Roelandt <cyril.roelandt@enovance.com>
Date:   Mon Aug 18 17:45:35 2014 +0000

    libvirt: Make sure volumes are well detected during block migration

(2) patch2
commit 5a0711dbffe3d68ee9be39c85307b19ea5efee7a
Author: Daniel Genin <Daniel.Genin@jhuapl.edu>
Date:   Mon Nov 17 15:16:14 2014 -0500

    libvirt: Fixes live migration for volume backed instances


(3) live migration related nova.conf flag

In addition, we need to take care of ‘live_migration_flag’ in ‘/etc/nova/nova.conf’.
This can be specified using files under ‘hp_passthrough’.

Use the following flags for ‘live_migration_flag’ in the [libvirt] section of /etc/nova/nova.conf.
In particular, it is important NOT to use ‘VIR_MIGRATE_PAUSED’.


[libvirt]
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_TUNNELLED
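
As a quick sanity check after editing the file, something like the following could confirm that ‘VIR_MIGRATE_PAUSED’ is really gone. This is only a minimal sketch: the function name is my own, and it uses Python's standard configparser as a stand-in for Nova's real option parsing, assuming the file is plain INI.

```python
import configparser


def paused_flag_present(conf_text):
    """Return True if [libvirt] live_migration_flag contains VIR_MIGRATE_PAUSED.

    Hypothetical helper: configparser only approximates how Nova itself
    reads nova.conf, so treat the result as a hint, not as proof.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # A missing section or option simply yields an empty flag list.
    raw = parser.get("libvirt", "live_migration_flag", fallback="")
    flags = [flag.strip() for flag in raw.split(",") if flag.strip()]
    return "VIR_MIGRATE_PAUSED" in flags
```

On a compute node it could be called as paused_flag_present(open("/etc/nova/nova.conf").read()); nova-compute still has to be restarted for a changed value to take effect.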


* Notes:

By default, HOS tripleo uses the following flags for the ‘live_migration_flag’ parameter.

live_migration_flag=VIR_MIGRATE_LIVE,VIR_MIGRATE_TUNNELLED,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_PAUSED

The ‘VIR_MIGRATE_PAUSED’ flag above leaves the migrated VM instances in the ‘paused’ state (at the libvirt layer).

  1) nova-compute outputs the following ‘WARNING’ line and just ignores the condition.

2015-06-17 04:43:22.706 34940 WARNING nova.compute.manager [-] [instance: 8a312c4a-1d5c-4135-964f-de2116ab02ba] Instance is paused unexpectedly. Ignore.

Thus,

  2) The VM is indeed migrated to the destination host, but customers cannot use it.

In addition,

  3) Nova reports the VM instances as ‘ACTIVE’. This is confusing because end users cannot use the VM instances
     until an administrator un-pauses them with ‘virsh resume’ on the compute node.
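
Since Nova still shows such instances as ‘ACTIVE’, the WARNING line is the easiest trace to look for. Below is a small sketch that pulls the affected instance UUIDs out of nova-compute log lines. The regular expression is modeled only on the single log line quoted above, so the exact format is an assumption, and the function name is hypothetical.

```python
import re

# Pattern modeled on the one WARNING line quoted in this post; other
# Nova versions or log formats may need a different expression.
PAUSED_WARNING = re.compile(
    r"WARNING nova\.compute\.manager .*?"
    r"\[instance: (?P<uuid>[0-9a-f-]{36})\] "
    r"Instance is paused unexpectedly\. Ignore\."
)


def paused_instances(log_lines):
    """Return instance UUIDs from matching lines, in first-seen order."""
    seen = []
    for line in log_lines:
        match = PAUSED_WARNING.search(line)
        if match and match.group("uuid") not in seen:
            seen.append(match.group("uuid"))
    return seen
```

Each UUID found this way is a candidate for ‘virsh resume’ on the destination compute node, after checking the actual domain state with virsh first.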


