[Devel] [PATCH RHEL7 COMMIT] ms/cfq-iosched: Fix wrong children_weight calculation

Konstantin Khorenko khorenko at virtuozzo.com
Fri Jul 17 06:42:34 PDT 2015


The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.25
------>
commit 79fc0086d46ff448af2b8f94ad2bab49d54b31a7
Author: Dmitry Monakhov <dmonakhov at openvz.org>
Date:   Fri Jul 17 17:42:34 2015 +0400

    ms/cfq-iosched: Fix wrong children_weight calculation
    
    ms commit: e15693ef18e13e3e6bffe891fe140f18b8ff6d07
    
    https://jira.sw.ru/browse/PSBM-34808
    
    From: Toshiaki Makita <makita.toshiaki at lab.ntt.co.jp>
    
    cfq_group_service_tree_add() is applying new_weight at the beginning of
    the function via cfq_update_group_weight().
    This actually allows weight to change between adding it to and subtracting
    it from children_weight, and triggers WARN_ON_ONCE() in
    cfq_group_service_tree_del(), or even causes oops by divide error during
    vfr calculation in cfq_group_service_tree_add().
    
    The detailed scenario is as follows:
    1. Create blkio cgroup X, and cgroup Y as a child of X.
       Set X's weight to 500 and perform some I/O to apply new_weight.
       This I/O of X completes before Y's I/O starts.
    2. Y starts I/O and cfq_group_service_tree_add() is called with Y.
    3. cfq_group_service_tree_add() walks up the tree during children_weight
       calculation and adds parent X's weight (500) to children_weight of root.
       children_weight becomes 500.
    4. Set X's weight to 1000.
    5. X starts I/O and cfq_group_service_tree_add() is called with X.
    6. cfq_group_service_tree_add() applies its new_weight (1000).
    7. I/O of Y completes and cfq_group_service_tree_del() is called with Y.
    8. I/O of X completes and cfq_group_service_tree_del() is called with X.
    9. cfq_group_service_tree_del() subtracts X's weight (1000) from
       children_weight of root. children_weight becomes -500.
       This triggers WARN_ON_ONCE().
    10. Set X's weight to 500.
    11. X starts I/O and cfq_group_service_tree_add() is called with X.
    12. cfq_group_service_tree_add() applies its new_weight (500) and adds it
        to children_weight of root. children_weight becomes 0. Calculation of
        vfr triggers oops by divide error.
    
    weight should be updated right before adding it to children_weight.
    
    Reported-by: Ruki Sekiya <sekiya.ruki at lab.ntt.co.jp>
    Signed-off-by: Toshiaki Makita <makita.toshiaki at lab.ntt.co.jp>
    Acked-by: Tejun Heo <tj at kernel.org>
    Cc: stable at vger.kernel.org
    Signed-off-by: Jens Axboe <axboe at fb.com>
    Signed-off-by: Dmitry Monakhov <dmonakhov at openvz.org>
---
 block/cfq-iosched.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 5d8d665..a86cd68 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1277,12 +1277,16 @@ __cfq_group_service_tree_add(struct cfq_rb_root *st, struct cfq_group *cfqg)
 static void
 cfq_update_group_weight(struct cfq_group *cfqg)
 {
-	BUG_ON(!RB_EMPTY_NODE(&cfqg->rb_node));
-
 	if (cfqg->new_weight) {
 		cfqg->weight = cfqg->new_weight;
 		cfqg->new_weight = 0;
 	}
+}
+
+static void
+cfq_update_group_leaf_weight(struct cfq_group *cfqg)
+{
+	BUG_ON(!RB_EMPTY_NODE(&cfqg->rb_node));
 
 	if (cfqg->new_leaf_weight) {
 		cfqg->leaf_weight = cfqg->new_leaf_weight;
@@ -1301,7 +1305,7 @@ cfq_group_service_tree_add(struct cfq_rb_root *st, struct cfq_group *cfqg)
 	/* add to the service tree */
 	BUG_ON(!RB_EMPTY_NODE(&cfqg->rb_node));
 
-	cfq_update_group_weight(cfqg);
+	cfq_update_group_leaf_weight(cfqg);
 	__cfq_group_service_tree_add(st, cfqg);
 
 	/*
@@ -1325,6 +1329,7 @@ cfq_group_service_tree_add(struct cfq_rb_root *st, struct cfq_group *cfqg)
 	 */
 	while ((parent = cfqg_parent(pos))) {
 		if (propagate) {
+			cfq_update_group_weight(pos);
 			propagate = !parent->nr_active++;
 			parent->children_weight += pos->weight;
 		}
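
The 12-step scenario in the commit message can be replayed with a small userspace model. This is only a sketch, not kernel code: `struct grp`, `tree_add()`, `tree_del()`, `replay()` and the `fixed` flag are hypothetical stand-ins for `struct cfq_group`, `cfq_group_service_tree_add()` and `cfq_group_service_tree_del()`. It assumes a signed `children_weight` so that the -500 from step 9 is visible, gives every group a leaf weight of 500 for simplicity, and leaves out the service tree and the vfr division itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct cfq_group; only the fields the scenario needs. */
struct grp {
	struct grp *parent;
	int weight, new_weight;	/* weight among siblings, pending update */
	int leaf_weight;	/* weight of the group's own tasks */
	int nr_active;
	int children_weight;	/* signed here so the underflow is visible */
};

static void update_weight(struct grp *g)
{
	if (g->new_weight) {
		g->weight = g->new_weight;
		g->new_weight = 0;
	}
}

/*
 * fixed == false: apply new_weight on entry (behaviour before the patch).
 * fixed == true:  apply it only right before it is added to the parent.
 */
static void tree_add(struct grp *g, bool fixed)
{
	struct grp *pos = g, *parent;
	bool propagate;

	if (!fixed)
		update_weight(g);
	propagate = !pos->nr_active++;
	pos->children_weight += pos->leaf_weight;
	while ((parent = pos->parent)) {
		if (propagate) {
			if (fixed)
				update_weight(pos);
			propagate = !parent->nr_active++;
			parent->children_weight += pos->weight;
		}
		pos = parent;
	}
}

static void tree_del(struct grp *g)
{
	struct grp *pos = g;
	bool propagate = !--pos->nr_active;

	pos->children_weight -= pos->leaf_weight;
	while (propagate && pos->parent) {
		propagate = !--pos->parent->nr_active;
		pos->parent->children_weight -= pos->weight;
		pos = pos->parent;
	}
}

/* Replays steps 1-12; reports root's children_weight after steps 9 and 12. */
static void replay(bool fixed, int *after_del, int *after_add)
{
	struct grp root = { .weight = 500, .leaf_weight = 500 };
	struct grp x = { .parent = &root, .weight = 500, .leaf_weight = 500 };
	struct grp y = { .parent = &x, .weight = 500, .leaf_weight = 500 };

	tree_add(&y, fixed);	/* steps 2-3 */
	x.new_weight = 1000;	/* step 4 */
	tree_add(&x, fixed);	/* steps 5-6 */
	tree_del(&y);		/* step 7 */
	tree_del(&x);		/* steps 8-9 */
	*after_del = root.children_weight;
	x.new_weight = 500;	/* step 10 */
	tree_add(&x, fixed);	/* steps 11-12 */
	*after_add = root.children_weight;
}
```

Calling `replay(false, ...)` leaves the root's `children_weight` at -500 after step 9 and at 0 after step 12, which is the divisor that would trigger the divide error. With `replay(true, ...)` the pending weight of X is not applied while X's old weight is still accounted in the root, so the same two points read 0 and 500.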


