[Devel] [PATCH RHEL7 COMMIT] kvm/x86: skip async_pf when in guest mode

Konstantin Khorenko khorenko at virtuozzo.com
Wed Dec 7 07:27:00 PST 2016


Please consider this patch for RK.

Den, let us know if you don't think it's needed.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 12/02/2016 05:35 PM, Konstantin Khorenko wrote:
> The commit is pushed to "branch-rh7-3.10.0-327.36.1.vz7.20.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
> after rh7-3.10.0-327.36.1.vz7.20.9
> ------>
> commit 5173f45a28cdf3d5808e236eab882273a760a363
> Author: Roman Kagan <rkagan at virtuozzo.com>
> Date:   Fri Dec 2 18:35:41 2016 +0400
>
>     kvm/x86: skip async_pf when in guest mode
>
>     Async pagefault machinery assumes communication with L1 guests only: all
>     the state -- MSRs, apf area addresses, etc, -- are for L1.  However, it
>     currently doesn't check if the vCPU is running L1 or L2, and may inject
>     a #PF into whatever context is currently executing.
>
>     To reproduce the problem, use a host with swap enabled, run a VM on it,
>     run a nested VM on top, and set RSS limit for L1 on the host via
>     /sys/fs/cgroup/memory/machine.slice/machine-*.scope/memory.limit_in_bytes
>     to swap it out (you may need to tighten and release it once or twice, or
>     create some memory load inside L1).  Very quickly L2 guest starts
>     receiving pagefaults with bogus %cr2 (apf tokens from the host
>     actually), and L1 guest starts accumulating tasks stuck in D state in
>     kvm_async_pf_task_wait.
>
>     To avoid that, only do async_pf stuff when executing L1 guest.
>
>     Note: this patch only fixes x86; other async_pf-capable arches may also
>     need something similar.
>
>     Signed-off-by: Roman Kagan <rkagan at virtuozzo.com>
>     Signed-off-by: Radim Krčmář <rkrcmar at redhat.com>
>     (cherry picked from commit 80e2a7bb8d7050d2ea6d8961c526a65d30d5eb08)
>
>     https://jira.sw.ru/browse/PSBM-54491
> ---
>  arch/x86/kvm/mmu.c | 2 +-
>  arch/x86/kvm/x86.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 17973ed..c82bf5f 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3481,7 +3481,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
>  	if (!async)
>  		return false; /* *pfn has correct page already */
>
> -	if (!prefault && can_do_async_pf(vcpu)) {
> +	if (!prefault && !is_guest_mode(vcpu) && can_do_async_pf(vcpu)) {
>  		trace_kvm_try_async_get_page(gva, gfn);
>  		if (kvm_find_async_pf_gfn(vcpu, gfn)) {
>  			trace_kvm_async_pf_doublefault(gva, gfn);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 78ea28c..4edeb8a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6780,7 +6780,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>  			++vcpu->stat.request_irq_exits;
>  		}
>
> -		kvm_check_async_pf_completion(vcpu);
> +		if (!is_guest_mode(vcpu))
> +			kvm_check_async_pf_completion(vcpu);
>
>  		if (signal_pending(current)) {
>  			r = -EINTR;
> .
>
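
For readers who want to see the effect of the two guards outside the kernel
tree, here is a minimal userspace sketch in C. It is not kernel code: struct
toy_vcpu and every toy_* function are hypothetical stand-ins invented for
illustration, and they only model the boolean logic the patch adds to
try_async_pf() and __vcpu_run().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_vcpu; not the kernel's layout. */
struct toy_vcpu {
	bool guest_mode;   /* true while the vCPU is executing L2 */
	bool apf_enabled;  /* assumed: L1 has enabled async_pf */
	int completions;   /* async_pf completion checks performed */
};

/* Models is_guest_mode(): is the vCPU currently running a nested guest? */
static bool toy_is_guest_mode(const struct toy_vcpu *v)
{
	return v->guest_mode;
}

/* Models can_do_async_pf(), reduced to a single flag for this sketch. */
static bool toy_can_do_async_pf(const struct toy_vcpu *v)
{
	return v->apf_enabled;
}

/* Mirrors the patched condition in try_async_pf(). */
static bool toy_should_try_async_pf(const struct toy_vcpu *v, bool prefault)
{
	return !prefault && !toy_is_guest_mode(v) && toy_can_do_async_pf(v);
}

/* Mirrors the patched call site in __vcpu_run(). */
static void toy_run_iteration(struct toy_vcpu *v)
{
	if (!toy_is_guest_mode(v))
		v->completions++; /* stands in for kvm_check_async_pf_completion() */
}

/* Small self-check: async_pf is attempted for L1 but skipped for L2. */
static void toy_self_check(void)
{
	struct toy_vcpu l1 = { .guest_mode = false, .apf_enabled = true };
	struct toy_vcpu l2 = { .guest_mode = true, .apf_enabled = true };

	assert(toy_should_try_async_pf(&l1, false));
	assert(!toy_should_try_async_pf(&l2, false));
}
```

In this model a vCPU running L2 never attempts an async pagefault and never
has completions checked, so no apf token can reach the L2 context. Once the
vCPU is back in L1, both paths behave as before.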

