[Devel] [PATCH RHEL7 COMMIT] kvm/x86: skip async_pf when in guest mode

Vasily Averin vvs at virtuozzo.com
Wed Dec 7 08:11:48 PST 2016


Den,
it's for our customers,
for both the VZ7-rtm (vz7.15.2) and VZ7-u1 (vz7.18.7) kernels.
Is it important for customers?

thank you,
	Vasily Averin

On 12/07/2016 07:06 PM, Denis V. Lunev wrote:
> On 12/07/2016 06:27 PM, Konstantin Khorenko wrote:
>> Please consider this for RK.
>>
>> Den, let us know if you don't think it's needed.
>>
> Which branch are you speaking about? For RK in UP3?
> How can we be sure that all QA nodes will be updated
> in this case? They are running unreleased kernels.
> 
> Den
> 
>> -- 
>> Best regards,
>>
>> Konstantin Khorenko,
>> Virtuozzo Linux Kernel Team
>>
>> On 12/02/2016 05:35 PM, Konstantin Khorenko wrote:
>>> The commit is pushed to "branch-rh7-3.10.0-327.36.1.vz7.20.x-ovz" and
>>> will appear at https://src.openvz.org/scm/ovz/vzkernel.git
>>> after rh7-3.10.0-327.36.1.vz7.20.9
>>> ------>
>>> commit 5173f45a28cdf3d5808e236eab882273a760a363
>>> Author: Roman Kagan <rkagan at virtuozzo.com>
>>> Date:   Fri Dec 2 18:35:41 2016 +0400
>>>
>>>     kvm/x86: skip async_pf when in guest mode
>>>
>>>     Async pagefault machinery assumes communication with L1 guests only: all
>>>     the state -- MSRs, apf area addresses, etc, -- are for L1.  However, it
>>>     currently doesn't check if the vCPU is running L1 or L2, and may inject
>>>
>>>     To reproduce the problem, use a host with swap enabled, run a VM on it,
>>>     run a nested VM on top, and set RSS limit for L1 on the host via
>>>     /sys/fs/cgroup/memory/machine.slice/machine-*.scope/memory.limit_in_bytes
>>>     to swap it out (you may need to tighten and release it once or twice, or
>>>     create some memory load inside L1).  Very quickly L2 guest starts
>>>     receiving pagefaults with bogus %cr2 (apf tokens from the host
>>>     actually), and L1 guest starts accumulating tasks stuck in D state in
>>>     kvm_async_pf_task_wait.
>>>
>>>     To avoid that, only do async_pf stuff when executing L1 guest.
>>>
>>>     Note: this patch only fixes x86; other async_pf-capable arches may also
>>>     need something similar.
>>>
>>>     Signed-off-by: Roman Kagan <rkagan at virtuozzo.com>
>>>     Signed-off-by: Radim Krčmář <rkrcmar at redhat.com>
>>>     (cherry picked from commit 80e2a7bb8d7050d2ea6d8961c526a65d30d5eb08)
>>>
>>>     https://jira.sw.ru/browse/PSBM-54491
>>> ---
>>>  arch/x86/kvm/mmu.c | 2 +-
>>>  arch/x86/kvm/x86.c | 3 ++-
>>>  2 files changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 17973ed..c82bf5f 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -3481,7 +3481,7 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
>>>      if (!async)
>>>          return false; /* *pfn has correct page already */
>>>
>>> -    if (!prefault && can_do_async_pf(vcpu)) {
>>> +    if (!prefault && !is_guest_mode(vcpu) && can_do_async_pf(vcpu)) {
>>>          trace_kvm_try_async_get_page(gva, gfn);
>>>          if (kvm_find_async_pf_gfn(vcpu, gfn)) {
>>>              trace_kvm_async_pf_doublefault(gva, gfn);
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 78ea28c..4edeb8a 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -6780,7 +6780,8 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
>>>              ++vcpu->stat.request_irq_exits;
>>>          }
>>>
>>> -        kvm_check_async_pf_completion(vcpu);
>>> +        if (!is_guest_mode(vcpu))
>>> +            kvm_check_async_pf_completion(vcpu);
>>>
>>>          if (signal_pending(current)) {
>>>              r = -EINTR;
>>> .
>>>
> 
> 
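For anyone skimming the thread, the effect of the two hunks quoted above can be modelled in a small standalone C sketch. This is not kernel code and is only an illustration under simplifying assumptions: `struct model_vcpu` and the `model_*` functions are hypothetical names invented here, and the counters are crude stand-ins for the real async_pf queues. The only thing it tries to show is the guard itself: no new async page fault is started and no completion is delivered while the vCPU is in guest mode (running L2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct kvm_vcpu. */
struct model_vcpu {
    bool guest_mode;        /* true while the vCPU is executing L2 */
    bool apf_enabled;       /* stand-in for can_do_async_pf() being true */
    unsigned int queued;    /* async page faults started, not yet completed */
    unsigned int ready;     /* completed faults waiting to be reported */
    unsigned int delivered; /* "page ready" notifications given to L1 */
};

/* Stand-in for the kernel's is_guest_mode() helper. */
static bool model_is_guest_mode(const struct model_vcpu *vcpu)
{
    return vcpu->guest_mode;
}

/*
 * Models the mmu.c hunk. Returns true if the fault was handed to the
 * async path, false if the caller has to resolve it synchronously.
 */
static bool model_try_async_pf(struct model_vcpu *vcpu, bool prefault)
{
    assert(vcpu != NULL);

    if (!prefault && !model_is_guest_mode(vcpu) && vcpu->apf_enabled) {
        vcpu->queued++;
        return true;
    }
    return false;
}

/* Models the host finishing the swap-in for one queued fault. */
static void model_page_ready(struct model_vcpu *vcpu)
{
    assert(vcpu != NULL);

    if (vcpu->queued > 0) {
        vcpu->queued--;
        vcpu->ready++;
    }
}

/*
 * Models the x86.c hunk. Completions are reported only while L1 is
 * running; in guest mode they stay pending. Returns how many were
 * delivered by this call.
 */
static unsigned int model_check_completion(struct model_vcpu *vcpu)
{
    unsigned int n;

    assert(vcpu != NULL);

    if (model_is_guest_mode(vcpu))
        return 0;

    n = vcpu->ready;
    vcpu->delivered += n;
    vcpu->ready = 0;
    return n;
}
```

In this model, a fault taken while `guest_mode` is set always falls back to the synchronous path, and a completion that becomes ready during L2 execution stays in `ready` until the vCPU is back in L1, so L2 never sees an apf token in %cr2.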

