[Devel] [PATCH RHEL7 COMMIT] kvm/x86: skip async_pf when in guest mode
Konstantin Khorenko
khorenko at virtuozzo.com
Wed Dec 7 07:27:00 PST 2016
Please consider this for RK.
Den, let us know if you don't think it's needed.
--
Best regards,
Konstantin Khorenko,
Virtuozzo Linux Kernel Team
On 12/02/2016 05:35 PM, Konstantin Khorenko wrote:
> The commit is pushed to "branch-rh7-3.10.0-327.36.1.vz7.20.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
> after rh7-3.10.0-327.36.1.vz7.20.9
> ------>
> commit 5173f45a28cdf3d5808e236eab882273a760a363
> Author: Roman Kagan <rkagan at virtuozzo.com>
> Date: Fri Dec 2 18:35:41 2016 +0400
>
> kvm/x86: skip async_pf when in guest mode
>
> Async pagefault machinery assumes communication with L1 guests only: all
> the state -- MSRs, apf area addresses, etc. -- is for L1. However, it
> currently doesn't check whether the vCPU is running L1 or L2, and may
> inject async page faults into L2 as well.
>
> To reproduce the problem, use a host with swap enabled, run a VM on it,
> run a nested VM on top, and set RSS limit for L1 on the host via
> /sys/fs/cgroup/memory/machine.slice/machine-*.scope/memory.limit_in_bytes
> to swap it out (you may need to tighten and release it once or twice, or
> create some memory load inside L1). Very quickly L2 guest starts
> receiving pagefaults with bogus %cr2 (apf tokens from the host
> actually), and L1 guest starts accumulating tasks stuck in D state in
> kvm_async_pf_task_wait.
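>
> For reference, the guest consumes these events in its #PF handler; a
> simplified sketch of that handler (roughly as in arch/x86/kernel/kvm.c of
> kernels from this era, details trimmed) shows why a token delivered to L2
> leaves the L1 waiter stuck:
>
> dotraplinkage void
> do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
> {
> 	switch (kvm_read_and_reset_pf_reason()) {
> 	default:
> 		/* not an async pf event -- handle as a normal #PF */
> 		do_page_fault(regs, error_code);
> 		break;
> 	case KVM_PV_REASON_PAGE_NOT_PRESENT:
> 		/* host swapped the page out; %cr2 carries an apf token */
> 		kvm_async_pf_task_wait((u32)read_cr2());
> 		break;
> 	case KVM_PV_REASON_PAGE_READY:
> 		/* page is back in; wake the task waiting on this token */
> 		kvm_async_pf_task_wake((u32)read_cr2());
> 		break;
> 	}
> }
>
> If such an event is delivered while L2 runs, L2 misreads the token in %cr2
> as a fault address, and the PAGE_READY notification that should wake the
> waiting task in L1 never reaches it.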
>
> To avoid that, only do async_pf stuff when executing L1 guest.
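>
> The L1/L2 distinction used for that is is_guest_mode(); roughly (it lives
> in arch/x86/include/asm/kvm_host.h in kernels of this vintage, exact
> location may differ), it tests a flag that nested VMENTRY sets and nested
> VMEXIT clears:
>
> static inline bool is_guest_mode(struct kvm_vcpu *vcpu)
> {
> 	/* HF_GUEST_MASK is set on nested VMENTRY, cleared on nested VMEXIT */
> 	return vcpu->arch.hflags & HF_GUEST_MASK;
> }
>
> So the guard in the hunks below simply postpones async_pf injection and
> completion handling until the vCPU is back in L1, where the apf MSRs and
> shared area actually apply.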
>
> Note: this patch only fixes x86; other async_pf-capable arches may also
> need something similar.
>
> Signed-off-by: Roman Kagan <rkagan at virtuozzo.com>
> Signed-off-by: Radim Krčmář <rkrcmar at redhat.com>
> (cherry picked from commit 80e2a7bb8d7050d2ea6d8961c526a65d30d5eb08)
>
> https://jira.sw.ru/browse/PSBM-54491
> ---
> arch/x86/kvm/mmu.c | 2 +-
> arch/x86/kvm/x86.c | 3 ++-
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 17973ed..c82bf5f 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3481,7 +3481,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
>  	if (!async)
>  		return false; /* *pfn has correct page already */
> 
> -	if (!prefault && can_do_async_pf(vcpu)) {
> +	if (!prefault && !is_guest_mode(vcpu) && can_do_async_pf(vcpu)) {
>  		trace_kvm_try_async_get_page(gva, gfn);
>  		if (kvm_find_async_pf_gfn(vcpu, gfn)) {
>  			trace_kvm_async_pf_doublefault(gva, gfn);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 78ea28c..4edeb8a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6780,7 +6780,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>  			++vcpu->stat.request_irq_exits;
>  		}
> 
> -		kvm_check_async_pf_completion(vcpu);
> +		if (!is_guest_mode(vcpu))
> +			kvm_check_async_pf_completion(vcpu);
> 
>  		if (signal_pending(current)) {
>  			r = -EINTR;
> .
>