[Devel] [PATCH RHEL8 COMMIT] userfaultfd: wp: add helper for writeprotect check

Konstantin Khorenko khorenko at virtuozzo.com
Mon Apr 20 10:34:28 MSK 2020


The commit is pushed to "branch-rh8-4.18.0-80.1.2.vz8.3.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh8-4.18.0-80.1.2.vz8.3.6
------>
commit 7078f501e994c7f0a19e892f709141b9345b8988
Author: Shaohua Li <shli at fb.com>
Date:   Mon Apr 20 10:34:28 2020 +0300

    userfaultfd: wp: add helper for writeprotect check
    
    Patch series "userfaultfd: write protection support", v6.
    
    Overview
    ========
    
    The uffd-wp work was initiated by Shaohua Li [1], and later continued by
    Andrea [2].  This series is based upon Andrea's latest userfaultfd tree,
    and it is a continuation of the work by both Shaohua and Andrea.  Many of
    the follow-up ideas come from Andrea too.
    
    Besides the old MISSING register mode of userfaultfd, the new uffd-wp
    support provides an alternative register mode called
    UFFDIO_REGISTER_MODE_WP that can be used to listen for write protection
    page faults rather than missing page faults; the two modes can also be
    registered together.  At the same time, the new feature also provides a
    new userfaultfd ioctl called UFFDIO_WRITEPROTECT which allows userspace
    to write protect a range of memory or fix up the write permission of
    faulted pages.
    
    Please refer to the document patch "userfaultfd: wp:
    UFFDIO_REGISTER_MODE_WP documentation update" for more information on the
    new interface and what it can do.
    
    The major workflow of a uffd-wp program should be:
    
      1. Register a memory region with WP mode using UFFDIO_REGISTER_MODE_WP
    
      2. Write protect part of the whole registered region using
         UFFDIO_WRITEPROTECT, passing in UFFDIO_WRITEPROTECT_MODE_WP to
         show that we want to write protect the range.
    
      3. Start a working thread that modifies the protected pages,
         meanwhile listening to UFFD messages.
    
      4. When a write to the protected range is detected, a page fault
         occurs and a UFFD message is generated and reported to the
         page fault handling thread.
    
      5. The page fault handler thread resolves the page fault using the
         new UFFDIO_WRITEPROTECT ioctl, but this time passing in
         !UFFDIO_WRITEPROTECT_MODE_WP (the flag cleared) to show that we
         want to restore the write permission.  Before this operation, the
         fault handler thread can do anything it wants, e.g., dump the
         page to persistent storage.
    
      6. The worker thread will continue running with the correctly
         applied write permission from step 5.
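
    The steps above can be sketched from userspace roughly as follows.  This
    is a minimal sketch, not code from this series: the helper names
    (uffd_open_wp, uffd_register_wp, wp_request, uffd_set_wp,
    uffd_wait_wp_fault) are hypothetical, and it assumes uapi headers that
    already define UFFDIO_WRITEPROTECT, a page-aligned range, and memory
    that has already been populated.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a userfaultfd and check that the kernel reports WP fault support. */
static inline int uffd_open_wp(void)
{
	int fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);
	if (fd < 0)
		return -1;

	struct uffdio_api api = { .api = UFFD_API, .features = 0 };
	if (ioctl(fd, UFFDIO_API, &api) < 0 ||
	    !(api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Step 1: register [start, start + len) with UFFDIO_REGISTER_MODE_WP. */
static inline int uffd_register_wp(int fd, uint64_t start, uint64_t len)
{
	struct uffdio_register reg = {
		.range = { .start = start, .len = len },
		.mode = UFFDIO_REGISTER_MODE_WP,
	};
	return ioctl(fd, UFFDIO_REGISTER, &reg);
}

/* Build the UFFDIO_WRITEPROTECT argument used by steps 2 and 5. */
static inline struct uffdio_writeprotect
wp_request(uint64_t start, uint64_t len, bool protect)
{
	assert(len != 0);
	struct uffdio_writeprotect wp = {
		.range = { .start = start, .len = len },
		/* Set the flag to protect (step 2), clear it to resolve (step 5). */
		.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0,
	};
	return wp;
}

/* Steps 2 and 5: apply or remove write protection on a range. */
static inline int uffd_set_wp(int fd, uint64_t start, uint64_t len, bool protect)
{
	struct uffdio_writeprotect wp = wp_request(start, len, protect);
	return ioctl(fd, UFFDIO_WRITEPROTECT, &wp);
}

/* Step 4: block until one write-protect fault is reported; store its address. */
static inline int uffd_wait_wp_fault(int fd, uint64_t *addr)
{
	struct uffd_msg msg;

	if (read(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		return -1;
	if (msg.event != UFFD_EVENT_PAGEFAULT ||
	    !(msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP))
		return -1;
	*addr = msg.arg.pagefault.address;
	return 0;
}
```

    A fault handler thread would typically loop on uffd_wait_wp_fault(), save
    the faulting page, and then call uffd_set_wp() on that page with protect
    set to false, which also wakes the blocked worker thread.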
    
    Currently there are already two projects that are based on this new
    userfaultfd feature.
    
    QEMU Live Snapshot: The project provides a way to allow the QEMU
                        hypervisor to take snapshots of VMs without
                        stopping the VM [3].
    
    LLNL umap library:  The project provides a mmap-like interface and
                        "allow to have an application specific buffer of
                        pages cached from a large file, i.e. out-of-core
                        execution using memory map" [4][5].
    
    Before posting the patchset, this series was smoke tested against QEMU
    live snapshot and the LLNL umap library (by doing parallel quicksort using
    128 sorting threads + 80 uffd servicing threads).  My sincere thanks to
    Marty McFadden and Denis Plotnikov for the help along the way.
    
    TODO
    ====
    
    - hugetlbfs/shmem support
    - performance
    - more architectures
    - cooperate with mprotect()-allowed processes (???)
    - ...
    
    References
    ==========
    
    [1] https://lwn.net/Articles/666187/
    [2] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/log/?h=userfault
    [3] https://github.com/denis-plotnikov/qemu/commits/background-snapshot-kvm
    [4] https://github.com/LLNL/umap
    [5] https://llnl-umap.readthedocs.io/en/develop/
    [6] https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git/commit/?h=userfault&id=b245ecf6cf59156966f3da6e6b674f6695a5ffa5
    [7] https://lkml.org/lkml/2018/11/21/370
    [8] https://lkml.org/lkml/2018/12/30/64
    
    This patch (of 19):
    
    Add a helper for the write-protect check.  Later patches in the series
    will use it.
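
    As an illustration of what the helper tests, the following is a userspace
    model only, not kernel code: the struct is a stripped-down stand-in for
    struct vm_area_struct, the flag values are placeholders for the kernel's
    VM_UFFD_* bits, and should_report_wp_fault() is a hypothetical caller,
    not one of the later patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values standing in for the kernel's VM_UFFD_* flags. */
#define VM_UFFD_MISSING	0x00000200UL
#define VM_UFFD_WP	0x00001000UL

static_assert((VM_UFFD_MISSING & VM_UFFD_WP) == 0,
	      "the two mode bits must not overlap");

/* Stripped-down stand-in for the kernel's struct vm_area_struct. */
struct vm_area_struct {
	unsigned long vm_flags;
};

/* True if the VMA was registered in MISSING mode. */
static inline bool userfaultfd_missing(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & VM_UFFD_MISSING) != 0;
}

/* True if the VMA was registered in write-protect mode. */
static inline bool userfaultfd_wp(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & VM_UFFD_WP) != 0;
}

/* True if the VMA was registered in either mode. */
static inline bool userfaultfd_armed(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP)) != 0;
}

/* Hypothetical caller: only write faults on WP-registered VMAs are reported. */
static inline bool should_report_wp_fault(const struct vm_area_struct *vma,
					  bool is_write)
{
	return is_write && userfaultfd_wp(vma);
}
```

    A VMA registered in both modes satisfies all three predicates, while
    userfaultfd_armed() alone cannot tell the two modes apart, which is why
    a separate userfaultfd_wp() check is useful.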
    
    Signed-off-by: Shaohua Li <shli at fb.com>
    Signed-off-by: Andrea Arcangeli <aarcange at redhat.com>
    Signed-off-by: Peter Xu <peterx at redhat.com>
    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
    Reviewed-by: Jerome Glisse <jglisse at redhat.com>
    Reviewed-by: Mike Rapoport <rppt at linux.vnet.ibm.com>
    Cc: Rik van Riel <riel at redhat.com>
    Cc: Kirill A. Shutemov <kirill at shutemov.name>
    Cc: Mel Gorman <mgorman at suse.de>
    Cc: Hugh Dickins <hughd at google.com>
    Cc: Johannes Weiner <hannes at cmpxchg.org>
    Cc: Bobby Powers <bobbypowers at gmail.com>
    Cc: Brian Geffon <bgeffon at google.com>
    Cc: David Hildenbrand <david at redhat.com>
    Cc: Denis Plotnikov <dplotnikov at virtuozzo.com>
    Cc: "Dr . David Alan Gilbert" <dgilbert at redhat.com>
    Cc: Martin Cracauer <cracauer at cons.org>
    Cc: Marty McFadden <mcfadden8 at llnl.gov>
    Cc: Maya Gokhale <gokhale2 at llnl.gov>
    Cc: Mike Kravetz <mike.kravetz at oracle.com>
    Cc: Pavel Emelyanov <xemul at openvz.org>
    Link: http://lkml.kernel.org/r/20200220163112.11409-2-peterx@redhat.com
    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
    
    https://jira.sw.ru/browse/PSBM-102938
    (cherry picked from commit 1df319e0b4dee11436fe2ab1a0d536d3fad7cfef)
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
---
 include/linux/userfaultfd_k.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index e091f0a11b11..52dcb3beaa5e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -50,6 +50,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_UFFD_MISSING;
 }
 
+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP);
@@ -93,6 +98,11 @@ static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 	return false;
 }
 
+static inline bool userfaultfd_wp(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 {
 	return false;

