[Devel] [PATCH VZ9 3/3] fs/fuse: enhanced splice support

Kui Liu kui.liu at acronis.com
Fri Jan 19 10:13:13 MSK 2024



> 
> About suggestion,  well, no. Not only this would be not simpler, it
> would be just wrong.
> All the checks and all the use must be done after fget before fput, otherwise
> user space can close the descriptor and even open something
> else in another thread.
> 

Yeah, I forgot about this concern when I suggested to do fdget/fdput in each loop,.

But how about the idea that we do fdget on every file in the first loop,  do all necessary
checks,  and then store returned fd if all checks pass or jump to out path in case of 
any failures. Then in the second loop, we can just use stored fds from the first loop
and do data copy.  In the out path, we can just  do fdput on all stored fds.  

However I’m not sure if there is any risk of holding multiple fds at the same time, or holding
one fd longer than necessary. 
 

> On Thu, Jan 18, 2024 at 8:26 PM Kui Liu <kui.liu at acronis.com> wrote:
>> 
>> 
>> 
>>> On 18 Jan 2024, at 5:08 PM, Alexey Kuznetsov <kuznet at virtuozzo.com> wrote:
>>> 
>>>> Is it difficult to separate the verification of inputs, such as fds, size of src buffs against dest buffs etc, from the process of copying data? Similar to what’s been done in ‘copy_out_args()’.
>>> 
>>> I am not sure I understand what you mean here. Show, plz.
>>> 
>>> Do you really think copy_out_args is even "readable"? :-)
>>> From my pov it is awful. Actually, I hold 4d patch in series
>>> which replaces copy_out_args with normal human intellibible
>>> function which copies data 10% faster. And I do not think
>>> it was by accident, ugly code is usually a bad code.
>> 
>> OK, what I meant was can we break the big for loop in two smaller ones.  In the first one, just verify whether inputs are valid, something like
>> 
>> +       for (i = 0; i < nsplices; i++) {
>> +               void *src, *dst;
>> +               struct fd f = fdget(fdarr[i]);
>> +
>> +               if (f.file) {
>> +                       unsigned int head, tail, mask;
>> +
>> +                       if (unlikely(f.file->f_op != fuse_g_splice_ops)) {
>> +                               if (fuse_g_splice_ops == NULL) {
>> +                                       struct file *probe_files[2];
>> +
>> +                                       if (create_pipe_files(probe_files, 0)) {
>> +                                               err = -EBADF;
>> +                                               goto out;
>> +                                       }
>> +                                       fuse_g_splice_ops = probe_files[0]->f_op;
>> +                                       fput(probe_files[0]);
>> +                                       fput(probe_files[1]);
>> +                               }
>> +                               if (unlikely(f.file->f_op != fuse_g_splice_ops)) {
>> +                                       err = -EBADF;
>> +                                       goto out;
>> +                               }
>> +                       }
>> +
>> +                       pipe = f.file->private_data;
>> +
>> +                       if (pipe == NULL) {
>> +                               err = -EBADF;
>> +                               goto out;
>> +                       }
>> +               } else {
>> +                       err = -EBADF;
>> +                       goto out;
>> +               }
>> 
>>                /* count total size of inputs */
>> 
>>                fdput(f);
>> +       }
>>        /* check whether size of inputs exceeds available buff in req->ap here */
>> 
>> 
>> after all checks pass, do data copying in the second loop.  Something like:
>> 
>>        for (i = 0; i < nsplices; i++) {
>> +               void *src, *dst;
>> +               struct fd f = fdget(fdarr[i]);
>> 
>> +               pipe = f.file->private_data;
>> +               pipe_lock(pipe);
>> +
>> +               head = pipe->head;
>> +               mask = pipe->ring_size - 1;
>> +
>> +               for (tail = pipe->tail; tail != head; tail++) {
>> +                       struct page *ipage = pipe->bufs[tail & mask].page;
>> +                       int ioff = pipe->bufs[tail & mask].offset;
>> +                       int ilen = pipe->bufs[tail & mask].len;
>> 
>> +                       src = kmap_atomic(ipage);
>> +                       while (ilen > 0) {
>> +                               int copy = ilen;
>> 
>> +                               if (doff >= dend) {
>> +                                       didx++;
>> +                                       dpage = ap->pages[didx];
>> +                                       doff = ap->descs[didx].offset;
>> +                                       dend = doff + ap->descs[didx].length;
>> +                               }
>> +
>> +                               if (copy > dend - doff)
>> +                                       copy = dend - doff;
>> +                                       dst = kmap_atomic(dpage);
>> +                                       memcpy(dst + doff, src + ioff, copy);
>> +                                       kunmap_atomic(dst);
>> +
>> +                                       doff += copy;
>> +                                       ioff += copy;
>> +                                       ilen -= copy;
>> +                               }
>>                        }
>> +                       kunmap_atomic(src);
>> +                       put_page(ipage);
>> +                       pipe->bufs[tail & mask].ops = NULL;
>> +                       pipe->bufs[tail & mask].page = NULL;
>> +                       pipe->tail = tail + 1;
>> +               }
>> +               pipe_unlock(pipe);
>> +               fdput(f);
>>        }
>> 
>> This way we can fail before starting to copy data if there’s something wrong in inputs, however it does fdget/ fdput twice, I’m not sure how expensive fdget/fdput can be.
>> If it’s expensive, maybe we can cache fdget() from the first loop and do fdput() after second loop.
>> 
>> BTW I just found out there’s no fdput() in the error path.
>> 
>> Lastly I was definitely not suggesting to do something similar in fuse_copy_args(), which I agree is not that readable.
>> 
>> 




More information about the Devel mailing list