KVM: x86: hyper-v: Don't touch TSC page values when guest opted for re-enlightenment
When guest opts for re-enlightenment notifications upon migration, it is in its right to assume that TSC page values never change (as they're only supposed to change upon migration and the host has to keep things as they are before it receives confirmation from the guest). This is mostly true until the guest is migrated somewhere. KVM userspace (e.g. QEMU) will trigger masterclock update by writing to HV_X64_MSR_REFERENCE_TSC, by calling KVM_SET_CLOCK,... and as TSC value and kvmclock reading drift apart (even slightly), the update causes TSC page values to change. The issue at hand is that when Hyper-V is migrated, it uses stale (cached) TSC page values to compute the difference between its own clocksource (provided by KVM) and its guests' TSC pages to program synthetic timers and in some cases, when TSC page is updated, this puts all stimer expirations in the past. This, in its turn, causes an interrupt storm and L2 guests not making much forward progress. Note, KVM doesn't fully implement re-enlightenment notification. Basically, the support for reenlightenment MSRs is just a stub and userspace is only expected to expose the feature when TSC scaling on the expected destination hosts is available. With TSC scaling, no real re-enlightenment is needed as TSC frequency doesn't change. With TSC scaling becoming ubiquitous, it likely makes little sense to fully implement re-enlightenment in KVM. Prevent TSC page from being updated after migration. In case it's not the guest who's initiating the change and when TSC page is already enabled, just keep it as it is: TSC value is supposed to be preserved across migration and TSC frequency can't change with re-enlightenment enabled. The guest is doomed anyway if any of this is not true. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210316143736.964151-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Please register or sign in to comment