[Devel] [RH7 PATCH 0/6] RFC ploop: Barrier fix patch set v3

Maxim Patlasov mpatlasov at virtuozzo.com
Thu Jun 23 15:28:20 PDT 2016


Dima,

On 06/23/2016 10:25 AM, Dmitry Monakhov wrote:
> Here is 3'rd version of barrier fix patches based on recent fixes.
> This is an RFC version. I do not have time to test it before tomorrow,
> Max please review is briefly and tell be your oppinion about general idea.

It's hard to review w/o context, and the series fail to apply to our vz7 
tree. So, I spent pretty long while trying to find a tag or commit where 
it's possible to apply your patches w/o rejects. The first patch wants 
PLOOP_REQ_ALLOW_READS in ploop.h:

> @@ -471,6 +471,7 @@ enum
>      PLOOP_REQ_POST_SUBMIT, /* preq needs post_submit processing */
>      PLOOP_REQ_PUSH_BACKUP, /* preq was ACKed by userspace push_backup */
>      PLOOP_REQ_ALLOW_READS, /* READs are allowed for given req_cluster */
> +    PLOOP_REQ_DEL_CONV,    /* post_submit: conversion required */
>      PLOOP_REQ_FSYNC_DONE,  /* fsync_thread() performed f_op->fsync() */
>  };

We removed ALLOW_READS by 06e7586 (Jun 3), so you must have 
rh7-3.10.0-327.18.2.vz7.14.11 or earlier. But the third patch has:

> @@ -562,7 +551,6 @@ dio_submit_pad(struct ploop_io *io, struct 
> ploop_request * preq,
>      sector_t sec, end_sec, nsec, start, end;
>      struct bio_list_walk bw;
>      int err;
> -
>      bio_list_init(&bl);
>
>      /* sec..end_sec is the range which we are going to write */

while after applying the first and the second, it looks like:

> static void
> dio_submit_pad(struct ploop_io *io, struct ploop_request * preq,
>            struct bio_list * sbl, unsigned int size,
>            struct extent_map *em)
> {
>     struct bio_list bl;
>     struct bio * bio = NULL;
>     sector_t sec, end_sec, nsec, start, end;
>     struct bio_list_walk bw;
>     int err;
>     int preflush = !!(preq->req_rw & REQ_FLUSH);
>
>     bio_list_init(&bl);

So the tree you used didn't have that "int preflush = !!(preq->req_rw & 
REQ_FLUSH);" line. But the patch removing this line was committed only 
yesterday, Jun 22 (c2247f3745).

After applying c2247f3745 before your series, another conflict happens 
while applying the third patch: the first hunk assumes the following 
lines in dio_submit:

>      rw &= ~(REQ_FLUSH | REQ_FUA);
> -
> -
>      bio_list_init(&bl);

But we always (since 2013) had:

>     rw &= ~(REQ_FLUSH | REQ_FUA);
>
>
>     /* In case of eng_state != COMPLETE, we'll do FUA in
>      * ploop_index_update(). Otherwise, we should mark
>      * last bio as FUA here. */
>     if (rw & REQ_FUA) {
>         rw &= ~REQ_FUA;
>         if (preq->eng_state == PLOOP_E_COMPLETE)
>             postfua = 1;
>     }
>
>     bio_list_init(&bl);

Hence, I drew conclusion, we need one of your previous patches to be 
applied at first. After very long row of trials and errors I eventually 
succeeded:

$ git show c2247f3745 > /tmp/my.diff
$ git checkout rh7-3.10.0-327.18.2.vz7.14.11
$ patch -p1 < ploop-fix-barriers-for-ordinary-requests
$ patch -p1 < ploop-skip-redundant-fsync-for-REQ_FUA-in-post_submit
$ patch -p1 < ploop-deadcode-cleanup
$ patch -p1 < ploop-generalize-post_submit-stage
$ patch -p1 < ploop-generalize-issue_flush
$ patch -p1 < ploop-add-delayed-flush-support
$ patch -p1 < ploop-io_kaio-support-PLOOP_REQ_DEL_FLUSH
$ patch -p1 < ploop-fixup-barrier-handling-during-relocation
$ patch -p1 < patch-ploop_state_debugging.patch


> Basic idea is to use post_submit state to issue empty FLUSH barrier in order
> to complete FUA requests. This allow us to unify all engines (direct and kaio).
>
> This makes FUA processing optimal:
> SUBMIT:FUA       :W1{b1,b2,b3,b4..},WAIT,post_submit:FLUSH
> SUBMIT_ALLOC:FUA :W1{b1,b2,b3,b4..},WAIT,post_submit:FLUSH, WBI:FUA

The above would be optimal only if all three statements below are true:

1) Lower layers process FUA as post-FLUSH.

I remember that you wrote that in real (linux kernel) life it is always 
true, but somehow I'm not sure about this "always"... Of course, we can 
investigate and eventually nail down the question, but for now I'm not 
convinced.

2) The list of (incoming) bio-s has more than one marked as FUA. 
Otherwise (i.e. if only one is FUA), it must be equivalent (from 
performance perspective): to submit FUA now, or to submit FLUSH later 
(modulo 1) above).

For now, I know only two entities generating FUA: ext4 writes superblock 
and jbd2 commits transaction. In both cases it is one page-sized bio and 
no way to have more than one FUA in queue. Do you know other cases?
(Of course, we know about fsync(), but it generates zero-size FLUSH. The 
way how ploop processes it is not affected by the patches we discussed.)

3) The benefits of delayed FLUSH must overweight the performance loss 
we'll have from extra WAIT introduced. I have not verified it yet, but I 
suspect it must be observable (e.g. by blktrace) that one page-sized bio 
marked as FLUSH|FUA completes faster than: submit one page-sized bio 
marked as FLUSH, wait for completion, submit zero-size FLUSH, wait for 
completion again. Makes sense?

If the three statements above are correct, and given some complexity 
added by the series, and possible performance loss due to extra WAIT, I 
think we have to find out if it's really possible to have many FUA-s in 
queue often enough (at least in some situations under some load), and 
then measure the performance before and after this "DEL_FLUSH" 
optimization to make decision. Please, do not take it as I dislike your 
idea, I'm ready to be involved in investigation and performance 
measurements. Thank you for your time and hard work.

Max

> RELOC_S: R1, W2,WAIT,post_submit:FLUSH, WBI:FUA
> RELOC_A: R1, W2,WAIT,post_submit:FLUSH, WBI:FUA, W1:NULLIFY,WAIT,post_submit:FLUSH
>
>
> #POST_SUBMIT CHANGES:
> ploop-generalize-post_submit-stage.patch
> ploop-generalize-issue_flush.patch
> ploop-add-delayed-flush-support.patch
> ploop-io_kaio-support-PLOOP_REQ_DEL_FLUSH.patch
> #RELOC_XXX FIXES
> ploop-fixup-barrier-handling-during-relocation.patch
> patch-ploop_state_debugging.patch.patch
>
>



More information about the Devel mailing list