[Devel] [PATCH RHEL9 COMMIT] ms/mm/ksm: add "smart" page scanning mode

Konstantin Khorenko khorenko at virtuozzo.com
Tue Aug 27 20:29:09 MSK 2024


The commit is pushed to "branch-rh9-5.14.0-427.31.1.vz9.70.x-ovz" and will appear at git at bitbucket.org:openvz/vzkernel.git
after rh9-5.14.0-427.31.1.vz9.68.1
------>
commit 2367ba591fd90b826aed758ef89135b1ccfa5af9
Author: Stefan Roesch <shr at devkernel.io>
Date:   Wed Aug 21 14:45:41 2024 +0800

    ms/mm/ksm: add "smart" page scanning mode
    
    Patch series "Smart scanning mode for KSM", v3.
    
    This patch series adds "smart scanning" for KSM.
    
    What is smart scanning?
    =======================
    KSM evaluates all the candidate pages for each scan. It does not use historic
    information from previous scans. This has the effect that candidate pages that
    couldn't be used for KSM de-duplication continue to be evaluated for each scan.
    
    The idea of "smart scanning" is to keep historic information. With the historic
    information we can temporarily skip the candidate page for one or several scans.
    
    Details:
    ========
    "Smart scanning" keeps two small counters per page to record its KSM
    history. One counter stores how often we have already tried to use the page
    for KSM, and the other counter stores how many more scans skip the page.
    
    How often we skip the candidate page depends on how often the page has failed
    KSM de-duplication. The code skips a page at most 8 times in a row. During
    testing this has proven to be a good compromise for different workloads.
    
    New sysfs knob:
    ===============
    Smart scanning is not enabled by default. It can be enabled with the
    /sys/kernel/mm/ksm/smart_scan knob.
    
    Monitoring:
    ===========
    To monitor how effective smart scanning is, a new sysfs knob has been
    introduced: /sys/kernel/mm/ksm/pages_skipped reports how many pages have been
    skipped by smart scanning.
    
    Results:
    ========
    - Various workloads have shown a 20% - 25% reduction in page scans.
      For the Instagram workload, for instance, the number of pages scanned has
      been reduced from over 20M pages per scan to less than 15M pages.
    - Fewer page scans also resulted in an overall higher de-duplication rate, as
      some shorter-lived pages could additionally be de-duplicated.
    - Scanning fewer pages allows the pages_to_scan parameter to be reduced,
      and this resulted in a 25% reduction in CPU usage.
    - The improvements have been observed for workloads that enable KSM with
      madvise as well as with prctl.
    
    This patch (of 4):
    
    This change adds a "smart" page scanning mode for KSM.  So far all the
    candidate pages are continuously scanned to find candidates for
    de-duplication.  There are a considerable number of pages that cannot be
    de-duplicated.  Scanning them is costly in terms of CPU.  By using smart
    scanning considerable CPU savings can be achieved.
    
    This change takes the history of scanning pages into account and skips the
    page scanning of certain pages for a while if de-duplication for this
    page has not been successful in the past.
    
    To do this it introduces two new fields in the ksm_rmap_item structure:
    age and remaining_skips.  age is the KSM age, and remaining_skips
    determines how often scanning of this page is skipped.  The age field is
    incremented each time the page is scanned and the page cannot be
    de-duplicated.  The age is capped at U8_MAX.
    
    How often a page is skipped depends on how often de-duplication has been
    tried so far, and the number of skips is currently limited to 8.  This
    value has proven to be effective with different workloads.
    
    The feature is currently disabled by default and can be enabled with the
    new smart_scan knob.
    
    The feature has proven to be very effective: up to 25% of the page scans
    can be eliminated; the pages_to_scan rate can be reduced by 40 - 50% and a
    similar de-duplication rate can be maintained.
    
    [akpm at linux-foundation.org: make ksm_smart_scan default true, for testing]
    Link: https://lkml.kernel.org/r/20230926040939.516161-1-shr@devkernel.io
    Link: https://lkml.kernel.org/r/20230926040939.516161-2-shr@devkernel.io
    Signed-off-by: Stefan Roesch <shr at devkernel.io>
    Reviewed-by: David Hildenbrand <david at redhat.com>
    Cc: Johannes Weiner <hannes at cmpxchg.org>
    Cc: Rik van Riel <riel at surriel.com>
    Cc: Stefan Roesch <shr at devkernel.io>
    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
    
    https://virtuozzo.atlassian.net/browse/PSBM-157809
    (cherry picked from commit 5e924ff54d088828794d9f1a4d5bf17808f7270e)
    Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    ======
    Patchset description:
    ksm: port smart scanning and advisor to improve performance
    
    1) "Smart" scanning allows ksm to skip pages which failed to be
    deduplicated after several iterations. It skips those pages for at most
    8 iterations and then retries. To enable:
    
    echo 1 > /sys/kernel/mm/ksm/smart_scan
    
    2) Ksm Advisor allows ksm to autoscale pages_to_scan based on data from
    previous scans, to perform a full memory scan in advisor_target_scan_time
    (200s by default). It will increase the scanning rate if new processes with
    more pages to deduplicate start, and will decrease the performance impact
    in more stable situations. To enable:
    
    echo "scan-time" > /sys/kernel/mm/ksm/advisor_mode
    
    note: Don't forget to enable ksm when using the above, with:
    
    echo 1 > /sys/kernel/mm/ksm/run
    
    note: It shows greater performance on sysbench and webbench perf tests
    in vconsolidate on csus > 40.
    
    https://virtuozzo.atlassian.net/browse/PSBM-157809
    Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
    
    Stefan Roesch (4):
      mm/ksm: add "smart" page scanning mode
      mm/ksm: add pages_skipped metric
      mm/ksm: add ksm advisor
      mm/ksm: add sysfs knobs for advisor
    
    Feature: ksm: smart scanning and advisor
---
 mm/ksm.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index 791268b260d3..1994db2b5014 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -52,6 +52,8 @@
 #define DO_NUMA(x)	do { } while (0)
 #endif
 
+typedef u8 rmap_age_t;
+
 /**
  * DOC: Overview
  *
@@ -189,6 +191,8 @@ struct ksm_stable_node {
  * @node: rb node of this rmap_item in the unstable tree
  * @head: pointer to stable_node heading this list in the stable tree
  * @hlist: link into hlist of rmap_items hanging off that stable_node
+ * @age: number of scan iterations since creation
+ * @remaining_skips: how many scans to skip
  */
 struct ksm_rmap_item {
 	struct ksm_rmap_item *rmap_list;
@@ -201,6 +205,8 @@ struct ksm_rmap_item {
 	struct mm_struct *mm;
 	unsigned long address;		/* + low bits used for flags below */
 	unsigned int oldchecksum;	/* when unstable */
+	rmap_age_t age;
+	rmap_age_t remaining_skips;
 	union {
 		struct rb_node node;	/* when node of unstable tree */
 		struct {		/* when listed from stable tree */
@@ -274,6 +280,10 @@ static unsigned int zero_checksum __read_mostly;
 /* Whether to merge empty (zeroed) pages with actual zero pages */
 static bool ksm_use_zero_pages __read_mostly;
 
+/* Skip pages that couldn't be de-duplicated previously */
+/* Default to true at least temporarily, for testing */
+static bool ksm_smart_scan = true;
+
 #ifdef CONFIG_NUMA
 /* Zeroed when merging across nodes is not allowed */
 static unsigned int ksm_merge_across_nodes = 1;
@@ -2211,6 +2221,73 @@ static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
 	return rmap_item;
 }
 
+/*
+ * Calculate skip age for the ksm page age. The age determines how often
+ * de-duplicating has already been tried unsuccessfully. If the age is
+ * smaller, the scanning of this page is skipped for less scans.
+ *
+ * @age: rmap_item age of page
+ */
+static unsigned int skip_age(rmap_age_t age)
+{
+	if (age <= 3)
+		return 1;
+	if (age <= 5)
+		return 2;
+	if (age <= 8)
+		return 4;
+
+	return 8;
+}
+
+/*
+ * Determines if a page should be skipped for the current scan.
+ *
+ * @page: page to check
+ * @rmap_item: associated rmap_item of page
+ */
+static bool should_skip_rmap_item(struct page *page,
+				  struct ksm_rmap_item *rmap_item)
+{
+	rmap_age_t age;
+
+	if (!ksm_smart_scan)
+		return false;
+
+	/*
+	 * Never skip pages that are already KSM; pages cmp_and_merge_page()
+	 * will essentially ignore them, but we still have to process them
+	 * properly.
+	 */
+	if (PageKsm(page))
+		return false;
+
+	age = rmap_item->age;
+	if (age != U8_MAX)
+		rmap_item->age++;
+
+	/*
+	 * Smaller ages are not skipped, they need to get a chance to go
+	 * through the different phases of the KSM merging.
+	 */
+	if (age < 3)
+		return false;
+
+	/*
+	 * Are we still allowed to skip? If not, then don't skip it
+	 * and determine how much more often we are allowed to skip next.
+	 */
+	if (!rmap_item->remaining_skips) {
+		rmap_item->remaining_skips = skip_age(age);
+		return false;
+	}
+
+	/* Skip this page */
+	rmap_item->remaining_skips--;
+	remove_rmap_item_from_tree(rmap_item);
+	return true;
+}
+
 static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page)
 {
 	struct mm_struct *mm;
@@ -2312,6 +2389,10 @@ static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page)
 				if (rmap_item) {
 					ksm_scan.rmap_list =
 							&rmap_item->rmap_list;
+
+					if (should_skip_rmap_item(*page, rmap_item))
+						goto next_page;
+
 					ksm_scan.address += PAGE_SIZE;
 				} else
 					put_page(*page);
@@ -3143,6 +3224,28 @@ static ssize_t full_scans_show(struct kobject *kobj,
 }
 KSM_ATTR_RO(full_scans);
 
+static ssize_t smart_scan_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%u\n", ksm_smart_scan);
+}
+
+static ssize_t smart_scan_store(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
+{
+	int err;
+	bool value;
+
+	err = kstrtobool(buf, &value);
+	if (err)
+		return -EINVAL;
+
+	ksm_smart_scan = value;
+	return count;
+}
+KSM_ATTR(smart_scan);
+
 static struct attribute *ksm_attrs[] = {
 	&sleep_millisecs_attr.attr,
 	&pages_to_scan_attr.attr,
@@ -3160,6 +3263,7 @@ static struct attribute *ksm_attrs[] = {
 	&stable_node_dups_attr.attr,
 	&stable_node_chains_prune_millisecs_attr.attr,
 	&use_zero_pages_attr.attr,
+	&smart_scan_attr.attr,
 	NULL,
 };
 

