[Devel] [PATCH] fuse kio: Add comment about why we need to wait pending read requests
Pavel Butsykin
pbutsykin at virtuozzo.com
Tue Oct 23 17:37:33 MSK 2018
On 23.10.2018 16:38, Kirill Tkhai wrote:
> On 23.10.2018 16:08, Pavel Butsykin wrote:
>>
>>
>> On 23.10.2018 13:19, Kirill Tkhai wrote:
>>> On 23.10.2018 13:15, Pavel Butsykin wrote:
>>>>
>>>>
>>>> On 23.10.2018 13:01, Kirill Tkhai wrote:
>>>>> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
>>>>> ---
>>>>> fs/fuse/kio/pcs/pcs_fuse_kdirect.c | 7 ++++++-
>>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/fuse/kio/pcs/pcs_fuse_kdirect.c b/fs/fuse/kio/pcs/pcs_fuse_kdirect.c
>>>>> index 3940d6c255ba..9b45fcbf8941 100644
>>>>> --- a/fs/fuse/kio/pcs/pcs_fuse_kdirect.c
>>>>> +++ b/fs/fuse/kio/pcs/pcs_fuse_kdirect.c
>>>>> @@ -1081,7 +1081,12 @@ static void pcs_kio_setattr_handle(struct fuse_inode *fi, struct fuse_req *req)
>>>>> r->end = req->end;
>>>>> if (di->size.op == PCS_SIZE_SHRINK) {
>>>>> BUG_ON(!mutex_is_locked(&req->io_inode->i_mutex));
>>>>> - /* wait for aio reads in flight */
>>>>> + /*
>>>>> + * Wait for already submitted aio reads. Further reads
>>>>> + * (including already queued to bg_queue) will be stopped
>>>>> + * by wait_shrink(), and they will be processed from
>>>>> + * _pcs_shrink_end().
>>>>> + */
>>>>
>>>> Why do we have to wait already submitted aio reads ?
>>>
>>> Because there could be grow requests. This place is broken, I'm fixing that at the moment.
>>> One-dimensional size.op is not enough to drive race between grow and shrink, we need to
>>> convert PCS_SIZE_* into bit fields like it used to be before.
>>>
>>
>> I don't understand. There are no chances to races between grow and
>> shrink, because the shrink is protected from simultaneous execution with
>> write requests at the fuse level. No write requests - no grows.
>
> Not exactly. Currently we have fallocate requests, which may spawn a grow operation.
> Even FALLOC_FL_KEEP_SIZE may do that -- then inode mutex is not held. So, the question
> is not so easy, what is correct, and which is place should be fixed.
>
So, we have two places where possible fallocate without mutex:
1.
static size_t fuse_send_unmap(struct fuse_req *req, struct fuse_io_priv *io,
loff_t pos, size_t count, fl_owner_t owner)
{
..
inarg->mode = FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE;
req->in.h.opcode = FUSE_FALLOCATE;
..
if (io->async)
return fuse_async_req_send(fc, req, count, io);
fuse_request_send(fc, req);
return count;
}
It's definitely a mistake, we should protect this fallocate
from simultaneous execution with shrink.
2.
static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
loff_t length)
{
...
bool lock_inode = !(mode & FALLOC_FL_KEEP_SIZE) ||
(mode & (FALLOC_FL_PUNCH_HOLE|FALLOC_FL_ZERO_RANGE));
/* Return error if mode is not supported */
if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
FALLOC_FL_ZERO_RANGE))
return -EOPNOTSUPP;
...
So, fallocate without mutex only possible
if (mode & FALLOC_FL_KEEP_SIZE) == FALLOC_FL_KEEP_SIZE.
Hmm, what does it mean ? Looks like it will be the same as
FALLOC_FL_ZERO_RANGE|FALLOC_FL_KEEP_SIZE.
I wonder, Do we need to handle such fallocate for vstorage at all?
It seems that NO, it's just PCS_IREQ_NOOP.
Alexey?
>>>>> inode_dio_wait(req->io_inode);
>>>>> /*
>>>>> * Writebackcache was flushed already so it is safe to
>>>>>
More information about the Devel
mailing list